Use FlightRadar24 etc. to get additional Data #24

Open
michaelweinold opened this issue May 12, 2024 · 15 comments

@michaelweinold
Member

Since AeroDataBox alone likely won't provide enough information to estimate the number of passengers on specific routes, we can supplement it with data from FlightRadar24 (or other sources).

Conveniently, there is already a Python package wrapping the API: FlightRadarAPI
I feed data to the site, so I have an active Business Subscription, which is required for API access. However, according to the documentation, the API does not offer historical data at the moment. If this really is the case, we might need to look for data "manually" in the FlightRadar24 data archive.

There is also pyflightdata and the ADSBExchange API.

@dodedic, I suggest you contact the ADSBExchange team about the possibility of receiving a dump of historical data (perhaps for the case studies):

Screenshot 2024-05-12 at 05 24 56
@michaelweinold added the enhancement (New feature or request) label on May 12, 2024
@dodedic
Contributor

dodedic commented May 14, 2024

FYI: E-Mail has been sent to them, still waiting on an answer.

@dodedic
Contributor

dodedic commented May 16, 2024

@arebe337 About the PAX numbers on all the routes: we could also estimate the average available seats on all routes using FlightRadar24, like I did here. We take the aircraft types from FR24 and average the available seats. Then, assuming an average load factor, we can derive an estimate of the number of PAX.

This approach has several implications:

  • It needs a large number of FR24 API requests. How is this limited on your account, @michaelweinold?
  • It is still an "estimate", but we could validate the data using routes where we know the actual number of PAX like I did with Lagos-Abuja.
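
A minimal sketch of the estimate and the validation step described above. The function names, the default load factor of 0.8, and the numbers in the usage lines are placeholders for illustration, not values from this project:

```python
def estimate_annual_pax(avg_seats, flights_per_day, load_factor=0.8, days=365):
    """Estimate annual passengers on one route from seat capacity.

    load_factor=0.8 is only a placeholder assumption; use whatever
    average load factor we agree on.
    """
    if not 0.0 <= load_factor <= 1.0:
        raise ValueError("load_factor must be between 0 and 1")
    return avg_seats * flights_per_day * days * load_factor


def relative_error(estimate, actual):
    """Signed relative deviation of an estimate from a known value."""
    return (estimate - actual) / actual


# Hypothetical route: 150 seats on average, 10 flights per day.
estimate = estimate_annual_pax(avg_seats=150, flights_per_day=10)
# Compare against a route where the real annual PAX is known (made-up number).
print(relative_error(estimate, actual=400_000))
```

The relative error on routes with known traffic (like Lagos-Abuja) would tell us how far to trust the estimate elsewhere.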

While ADB has an API call to get the aircraft types departing from an airport and the routes they fly, it is limited to the current date.
This can actually be done with a separate Tier 3 call in ADB! More info in the comment below.

@arebe337
Contributor

@dodedic what would we get from this Tier 3 call?

@dodedic
Contributor

dodedic commented May 16, 2024

@arebe337, I could provide a table of data, just like the one @michaelweinold suggested here.

It would list the "average" available seats for each route in this format. Combined with your script, which uses the number of flights, this gives us the PAX estimate. As I mentioned above, we would benchmark the data using routes where we know the annual PAX traffic.

The API call I would use is this one. It's a Tier 3 request, of which we still have 150'000. In this request we get the destination and the aircraft type (but crucially not the number of flights... so the work done so far is still important).

I will first make a list of all aircraft types and attach an average available seats to it.
Then, for all 3'144 airports, I would take the departing flights only and, for each destination, average the number of available seats across all departing flights.
I would do this for 7 days in each of 3 weeks. One call covers 12 hours, so 2 calls cover 1 day: 3'144 airports × 2 calls × 7 days = 44'016 calls for 1 week. So we could cover 3 weeks of the year for all airports with 132'048 of the 150'000 available calls.
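
The call budget above can be checked with a few lines. This is just the arithmetic from this comment; the function name is made up for illustration:

```python
def tier3_calls(airports, weeks, calls_per_day=2, days_per_week=7):
    """Number of calls needed when each call covers 12 hours of departures."""
    return airports * calls_per_day * days_per_week * weeks


calls_one_week = tier3_calls(airports=3_144, weeks=1)
calls_three_weeks = tier3_calls(airports=3_144, weeks=3)
remaining = 150_000 - calls_three_weeks

print(calls_one_week, calls_three_weeks, remaining)  # 44016 132048 17952
```

That leaves roughly 18'000 Tier 3 calls as a buffer for retries or extra airports.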

@dodedic
Contributor

dodedic commented May 16, 2024

We could also trial this approach by using one specific route again first, if the Lagos-Abuja example from here is not convincing enough.

@michaelweinold Would love to hear your input on this method/estimation before I get to coding and making API calls.

@arebe337
Contributor

So, @dodedic, are you referring to the 'Flight status' call in this scenario? The link you provided directs to the 'FIDS (airport departures and arrivals) - by relative time / by current time' call. I'm unsure where that one would display the type of aircraft.

But that approach could indeed give us a reliable approximation!

@dodedic
Contributor

dodedic commented May 16, 2024

Actually it is the 'FIDS (airport departures and arrivals) - by relative time / by current time' call!
It gives this response, with 473 departures in this case. Under "airport" is the destination from LSZH and under "aircraft" is the type.

image

@dodedic
Contributor

dodedic commented May 16, 2024

If we then use this call, we get the exact number of seats on that flight. This is a Tier 1 call, of which we have 200'000. Say we request the data for one aircraft by its registration and store the result. If the same aircraft shows up again on a later flight, we simply reuse the stored value instead of making a new API request. This keeps us from burning through too many API requests.

With the current worldwide airliner fleet size around 20'000-30'000 aircraft this should work out.
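
A rough sketch of the caching idea, assuming we have some function that performs the Tier 1 lookup for one registration. `fetch_seats` and the registrations below are hypothetical stand-ins, not the actual ADB client:

```python
class SeatCache:
    """Remember the seat count per registration so each aircraft costs one call."""

    def __init__(self, fetch_seats):
        # fetch_seats: callable taking a registration and returning a seat count.
        # In practice this would wrap the Tier 1 API request (assumption).
        self._fetch_seats = fetch_seats
        self._seats = {}
        self.api_calls = 0

    def seats_for(self, registration):
        if registration not in self._seats:
            self._seats[registration] = self._fetch_seats(registration)
            self.api_calls += 1
        return self._seats[registration]


# Stand-in for the real API: a fixed lookup table with made-up values.
fake_api = {"REG-A": 180, "REG-B": 120}
cache = SeatCache(fake_api.__getitem__)

for reg in ["REG-A", "REG-B", "REG-A", "REG-A"]:
    cache.seats_for(reg)

print(cache.api_calls)  # 2
```

With this, the number of Tier 1 calls is bounded by the number of distinct aircraft we see, not by the number of flights. Persisting the dictionary to disk between runs would keep the saving across the three sample weeks.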

image

@arebe337
Contributor

Alright, that sounds like a plan! I would be really happy if you could do this part :)
Currently, I'm preparing everything with the GDP sheet so we can seamlessly integrate it into your code for scaling.

@dodedic
Contributor

dodedic commented May 16, 2024

Of course! I will do it gladly 😄
Just want to make sure the approach is all good with Michael first.

@dodedic
Contributor

dodedic commented May 16, 2024

Possible issue I see:
Let's say I take one week each in August, January and May to get a good spread of data across high and low season.
There might be flight routes in your file that won't be flown in the 3 weeks I select. I would say this mostly affects smaller airports. Do we see this as a big issue?

@arebe337
Contributor

I agree, it shouldn't pose a major issue. However, we do need to consider our approach for handling such cases. We should definitely identify the month with the highest number of different connections and prioritize that month. But I also agree, taking into account different seasons could also be beneficial for a more comprehensive analysis.
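
One way to handle this, sketched under the assumption that we can list flights as `(month, origin, destination)` tuples. The tuple layout, function names, and airport codes are illustrative only:

```python
from collections import defaultdict


def routes_per_month(flights):
    """Group distinct (origin, destination) pairs by month."""
    routes = defaultdict(set)
    for month, origin, destination in flights:
        routes[month].add((origin, destination))
    return routes


def month_with_most_routes(flights):
    """Return the month that covers the most distinct connections."""
    routes = routes_per_month(flights)
    return max(routes, key=lambda month: len(routes[month]))


def uncovered_routes(all_routes, sampled_flights):
    """Routes from the full route file that never appear in the sampled weeks."""
    seen = {(origin, destination) for _, origin, destination in sampled_flights}
    return set(all_routes) - seen


# Made-up sample data with placeholder airport codes.
sample = [
    ("Jan", "AAA", "BBB"),
    ("Aug", "AAA", "BBB"),
    ("Aug", "AAA", "CCC"),
]
print(month_with_most_routes(sample))                             # Aug
print(uncovered_routes([("AAA", "BBB"), ("AAA", "DDD")], sample))  # {('AAA', 'DDD')}
```

The uncovered routes could then fall back to a type-based seat average or be flagged for a manual check.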

@michaelweinold
Member Author

@dodedic:

I will first make a list of all aircraft types and attach an average available seats to it. #24 (comment)

You can use the table I created with my last master's student:
https://github.com/sustainableaviation/Aircraft-Performance/blob/main/Databank.xlsx

...but as per your more recent comment #24 (comment), it seems that you can go through all active aircraft and just get the exact number of seats.

Actually it is the 'FIDS (airport departures and arrivals) - by relative time / by current time' call! #24 (comment)

So in this example, you have 473 departures, all(?) of which are from a single aircraft HB-JJK?

The API call I would use is this one. It's a Tier 3 request, of which we still have 150'000. In this request we get the destination and the aircraft type (but crucially not the number of flights... so the work done so far is still important). #24 (comment)

So what exactly does the GetAirportFlightsRelative call return? The aircraft registration for every aircraft departing the airport? What do you mean by "In this request we get the destination and the aircraft type (but crucially not the number of flights... so the work done so far is still important)."?

@dodedic
Contributor

dodedic commented May 16, 2024

...but as per your more recent comment #24 (comment), it seems that you can go through all active aircraft and just get the exact number of seats.

Exactly, we get all the active aircraft.

So in this example, you have 473 departures, all(?) of which are from a single aircraft HB-JJK?

Actually, we get all departures from LSZH in the selected timeframe, one of which happened to be HB-JJK. All other flights are from other airlines and other aircraft.

So what exactly does the GetAirportFlightsRelative call return? The aircraft registration for every aircraft departing the airport? What do you mean by "In this request we get the destination and the aircraft type (but crucially not the number of flights... so the work done so far is still important)."?

It returns (among other information) the destination of each flight departing the airport within the specified timeframe, as well as the exact aircraft registration and the aircraft type, as seen in this comment's screenshot. This means we can either take the general aircraft type and estimate the seats via your compiled list, or make an API call with ADB, as mentioned above, to get the exact number of seats on that flight.
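
To make the two options concrete, here is a sketch that prefers the exact per-registration seat count and falls back to the per-type average. The flat record keys (`destination`, `registration`, `aircraft_type`) are a simplification I am assuming for illustration; the real response nests these fields differently, and all codes and seat numbers below are made up:

```python
from collections import defaultdict
from statistics import mean


def average_seats_by_destination(departures, seats_by_type, seats_by_registration=None):
    """Average available seats per destination for one departure airport.

    departures: list of dicts with hypothetical keys
        "destination", "registration", "aircraft_type".
    seats_by_type: fallback table, aircraft type -> average seats.
    seats_by_registration: optional exact seat counts per aircraft.
    """
    seats_by_registration = seats_by_registration or {}
    per_destination = defaultdict(list)
    for flight in departures:
        seats = seats_by_registration.get(flight.get("registration"))
        if seats is None:
            seats = seats_by_type.get(flight.get("aircraft_type"))
        if seats is None:
            continue  # neither source knows this aircraft, so skip the flight
        per_destination[flight["destination"]].append(seats)
    return {dest: mean(values) for dest, values in per_destination.items()}


departures = [
    {"destination": "AAA", "registration": "REG-A", "aircraft_type": "TYPE-1"},
    {"destination": "AAA", "registration": "REG-X", "aircraft_type": "TYPE-2"},
    {"destination": "BBB", "registration": "REG-Y", "aircraft_type": "TYPE-9"},
]
result = average_seats_by_destination(
    departures,
    seats_by_type={"TYPE-1": 150, "TYPE-2": 100},
    seats_by_registration={"REG-A": 200},
)
print(result)  # {'AAA': 150}
```

Destinations where no flight could be sized (BBB here) simply drop out, so it would be worth logging them to see how much coverage we lose.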

"In this request we get the destination and the aircraft type (but crucially not the number of flights... so the work done so far is still important)."?

This meant that the GetAirportFlightsRelative call crucially doesn't include the average number of daily flights from an airport, so the Tier 3 calls we have made so far using the Statistical API were still necessary. That was just a note.

@dodedic
Contributor

dodedic commented May 16, 2024

I am currently updating the Mermaid diagram to visualize this process!
