Jfjelstul Worldcup Data-csv Appearances May 2026
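Using the appearances table, you must calculate time_played = (substitute_out - substitute_in) for each row. For players who played the full 90 (or 120), the logic is different: there is no substitution minute to subtract, so a starter's entry counts as minute 0 and a player who was never taken off is counted through the final whistle.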
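The rule above can be sketched in pandas. This is a minimal illustration on made-up rows, not the repository's own method: it assumes the column names used in this article (game_started, substitute_in, substitute_out) and a hypothetical match_length column holding 90 or 120. Check the names against the repository's codebook before pointing it at the real file.

```python
import numpy as np
import pandas as pd


def add_time_played(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with a time_played column (minutes, stoppage time ignored)."""
    out = df.copy()
    # Starters enter at minute 0; substitutes enter at substitute_in.
    start = np.where(out["game_started"], 0.0, out["substitute_in"])
    # Players who were never taken off stay on until the final whistle.
    # match_length (90 or 120) is a hypothetical column for this sketch.
    end = out["substitute_out"].fillna(out["match_length"])
    out["time_played"] = end - start
    # Unused reserves have no entry minute, so the difference is NaN: count it as 0.
    out["time_played"] = out["time_played"].fillna(0).astype(int)
    return out


# Made-up rows covering each case described above.
sample = pd.DataFrame({
    "player": ["starter_full", "starter_subbed", "late_sub", "unused", "extra_time"],
    "game_started": [True, True, False, False, True],
    "substitute_in": [np.nan, np.nan, 75, np.nan, np.nan],
    "substitute_out": [np.nan, 60, np.nan, np.nan, np.nan],
    "match_length": [90, 90, 90, 90, 120],
})

result = add_time_played(sample)
print(result[["player", "time_played"]])
# time_played column: 90, 60, 15, 0, 120
```

Stoppage time is deliberately left out, so totals built on this column are lower bounds on real playing time.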
import pandas as pd

# Load the two tables straight from the repository
appearances = pd.read_csv('https://raw.githubusercontent.com/jfjelstul/worldcup/master/data-csv/appearances.csv')
goals = pd.read_csv('https://raw.githubusercontent.com/jfjelstul/worldcup/master/data-csv/goals.csv')

# Filter for substitutes (game_started = FALSE)
subs = appearances[appearances['game_started'] == False]

# Merge with goals to count goals by sub appearances
sub_goals = goals.merge(subs, on=['match_id', 'player_id'])
sub_goals_count = sub_goals.groupby('player_name_x').size().reset_index(name='goals')
sub_goals_count.sort_values('goals', ascending=False).head(10)
At first glance, it is merely a log of who played when. But look closer. This table is the structural engineering of football history. It tells you not just who won, but who endured. It captures the 89th-minute substitutions, the yellow card accumulation, the captains who played every second of extra time, and the reserves who never saw the pitch.
SELECT player_id, player_name, team, SUM(minutes_played) AS total_minutes
FROM appearances
WHERE tournament = '2022'
GROUP BY player_id, player_name, team
ORDER BY total_minutes DESC;

Goalkeepers and center-backs from finalists dominate. In 2022, a goalkeeper such as Emiliano Martínez (Argentina) or Hugo Lloris (France) would be expected near the top of the list. A player on the pitch for all seven of Argentina's matches, two of which went to extra time, reaches about 690 minutes before stoppage time. But the real magic is historical: In 2014, Manuel Neuer played every single minute of Germany’s run, including the final.

3. The Tactical Insight: Substitution Dynamics Over Time

The substitute_in and substitute_out columns allow you to map the evolution of tactics. Before 1970, substitutions were practically non-existent; the 1970 tournament was the first World Cup to allow them. By 2022, five substitutions were allowed per team.
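One way to see that shift in the data is to count how many players came off the bench per team per match and average the counts within each tournament. The sketch below runs on made-up rows and assumes hypothetical tournament_year, match_id, and team columns alongside substitute_in. Treat it as an illustration of the idea, not as the dataset's actual layout.

```python
import numpy as np
import pandas as pd


def subs_per_team_match(df: pd.DataFrame) -> pd.Series:
    """Average number of substitutes used per team per match, by tournament."""
    per_team_match = (
        # A non-null substitute_in marks a player who came off the bench.
        df.assign(came_on=df["substitute_in"].notna())
          .groupby(["tournament_year", "match_id", "team"])["came_on"]
          .sum()
    )
    # Average the per-team counts within each tournament.
    return per_team_match.groupby(level="tournament_year").mean()


# Made-up appearance rows: (tournament_year, match_id, team, substitute_in).
rows = [
    (1970, "M1", "A", np.nan), (1970, "M1", "A", 60), (1970, "M1", "A", 80),
    (1970, "M1", "B", np.nan), (1970, "M1", "B", np.nan),
    (2022, "M2", "C", 46), (2022, "M2", "C", 60), (2022, "M2", "C", 60),
    (2022, "M2", "C", 75), (2022, "M2", "C", 85), (2022, "M2", "C", np.nan),
    (2022, "M2", "D", 55), (2022, "M2", "D", 70), (2022, "M2", "D", 88),
    (2022, "M2", "D", np.nan),
]
sample = pd.DataFrame(rows, columns=["tournament_year", "match_id", "team", "substitute_in"])

subs_by_tournament = subs_per_team_match(sample)
print(subs_by_tournament)
# 1970 -> 1.0 (teams used 2 and 0), 2022 -> 4.0 (teams used 5 and 3)
```

Because starters and unused reserves are kept in the frame, a team that made no changes still contributes a zero to the average.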
In the ecosystem of sports data science, few repositories are as meticulously maintained or as democratically accessible as Joshua Fjelstul’s jfjelstul/worldcup database. While the goals.csv file gets the glory and the matches.csv file provides the narrative spine, there is one table that captures the raw, human cost of the World Cup: appearances.csv.
Calculate the average minute of the first substitution per decade.
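A possible starting point for this exercise, sketched on made-up rows. It reads "first substitution" as the earliest substitute_in for each team in each match, which is one interpretation among several, and it assumes hypothetical tournament_year, match_id, and team columns.

```python
import pandas as pd


def first_sub_minute_by_decade(df: pd.DataFrame) -> pd.Series:
    """Mean minute of each team's first substitution, grouped by decade."""
    # Keep only players who actually came on.
    subs = df.dropna(subset=["substitute_in"])
    # Earliest bench entry for each team in each match.
    first_sub = (
        subs.groupby(["tournament_year", "match_id", "team"])["substitute_in"]
            .min()
            .reset_index()
    )
    # Floor the year to its decade: 1974 -> 1970, 2022 -> 2020.
    first_sub["decade"] = first_sub["tournament_year"] // 10 * 10
    return first_sub.groupby("decade")["substitute_in"].mean()


# Made-up substitute rows: (tournament_year, match_id, team, substitute_in).
rows = [
    (1970, "M1", "A", 60), (1970, "M1", "A", 80),
    (1974, "M2", "B", 70),
    (2018, "M3", "C", 46), (2018, "M3", "C", 58),
    (2022, "M4", "D", 55), (2022, "M4", "D", 65),
    (2022, "M4", "E", 45),
]
sample = pd.DataFrame(rows, columns=["tournament_year", "match_id", "team", "substitute_in"])

avg_first_sub = first_sub_minute_by_decade(sample)
print(avg_first_sub)
# 1970 -> 65.0, 2010 -> 46.0, 2020 -> 50.0
```

Teams that made no substitution in a match drop out of the average here, so the result describes when the first change happened among teams that made one.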
