I've been playing Trackmania, a racing game, recently, and they introduced a new concept called Cup of the Day. Every day a brand new map is released. For 15 minutes everyone tries to set the fastest time, and based on that time players are put into groups of 64. Then for 23 rounds people play and the slowest ones are eliminated until there's only 1 winner left. This is super fun to play!
Each map has 4 times associated with it: "Author Time", "Gold Medal", "Silver Medal", "Bronze Medal". Those times are not equally difficult across maps. I've been trying to get gold medals on all the tracks; some take me a few minutes, others take hours. So I've been trying to figure out a way to get a sense of how difficult each medal is to get.
Getting the data
Fortunately, there's an in-game leaderboard that tells you the times everyone made. So my plan was to scrape this leaderboard, figure out how many people got which medal and hope to get a sense of how hard the map is.
Even better, the website trackmania.io has all the information I needed: the medal times and the actual leaderboard.
Not only that, but the website is a single-page app written in Vue that queries its data from a server endpoint as JSON. This makes retrieving the information even more straightforward: no need to parse HTML.
At this point, what I need is to figure out how many people got each medal. The traditional way to do a leaderboard is to have the endpoint return a fixed number of results per page, with a pagination system. In that case I would do a binary search to find where the medal boundaries lie.
But it turned out to be even easier: the pagination API starts each page from a specific time. So the algorithm was to query the map's medal times and, for each of them, query the leaderboard starting at that time and take the position of the first result to know how many people got that medal or better!
For example, say the gold time is 1:03.000. I query the leaderboard starting at 1:03; the first person has 1:03.012 and is at position 2310. They don't have the gold medal, only silver. But 2309 people have either the author time or the gold medal.
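The arithmetic in this example can be sketched as a small helper. This is only an illustration, not code from the scraper: `countAtOrBetter` and `medalCounts` are hypothetical names, and it assumes, as in the example, that the first leaderboard entry returned for a medal time is the first player who missed that medal.

```javascript
// Hypothetical helper: given the 1-based position of the first player who is
// slower than a medal time, return how many players got that medal or better.
function countAtOrBetter(firstSlowerPosition) {
  return firstSlowerPosition - 1;
}

// Turn the cumulative "medal or better" counts into per-medal counts.
function medalCounts({ rankAT, rankGold, rankSilver, rankBronze }) {
  return {
    author: rankAT,
    gold: rankGold - rankAT,
    silver: rankSilver - rankGold,
    bronze: rankBronze - rankSilver,
  };
}

// Example from the text: the first player past the gold time is at position 2310.
console.log(countAtOrBetter(2310)); // 2309
```

The cumulative counts are what the rest of this post works with; the per-medal split is just a subtraction away if you need it.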
Coding the scraper
I decided to go with a Node.js script this time around, but you can use whatever language you want; I've done a lot of scraping in PHP in the past.
The first thing you want to do is add a caching layer so you don't spam the server and get yourself banned. It also makes it much quicker to iterate, since every subsequent run is instant. Here's a quick & dirty way to build caching:
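    const fs = require('fs');

    // Make sure the cache folder exists before writing into it.
    fs.mkdirSync('cache', { recursive: true });

    async function fetchCachedJSON(url) {
      // Turn the URL into a filesystem-safe file name.
      const key = url.replace(/[^a-zA-Z0-9]/g, '-').replace(/[-]+/g, '-');
      const cachePath = `cache/${key}.json`;
      if (fs.existsSync(cachePath)) {
        return JSON.parse(fs.readFileSync(cachePath, 'utf8'));
      }
      const json = await fetchJSON(url);
      fs.writeFileSync(cachePath, JSON.stringify(json));
      return json;
    }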
And this is what my cache/ folder looks like after it's been running for a while.
The great aspect of this is that all of those are single files that can be looked at manually and edited if need be. If something got messed up or you got banned, you can delete the specific files and retry later.
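To make the file naming concrete, here is the key derivation from `fetchCachedJSON` pulled out into a standalone function and applied to one of the URLs the scraper requests. The name `cacheKeyForURL` is mine, for illustration only.

```javascript
// Same two replace() calls as in fetchCachedJSON, extracted for illustration.
function cacheKeyForURL(url) {
  return url
    .replace(/[^a-zA-Z0-9]/g, '-') // anything that is not alphanumeric becomes '-'
    .replace(/[-]+/g, '-');        // collapse runs of '-' into a single one
}

console.log(cacheKeyForURL('https://trackmania.io/api/totd/0'));
// https-trackmania-io-api-totd-0
```

Note that two URLs differing only in punctuation would map to the same file. That is fine for a quick & dirty cache like this one, but worth keeping in mind if you reuse the idea elsewhere.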
If you look at the code, you'll notice that I didn't use the fetch API directly but instead used a fetchJSON function. The reason for this is that you'll most likely want to do some special things.
You probably will need some sort of custom headers for authentication or mime type. It's also a good place to add a sleep so you don't spam the server too heavily and get banned.
    // accessToken is expected to be defined elsewhere in the script.
    async function fetchJSON(url) {
      const response = await fetch(url, {
        headers: {
          'Authorization': 'nadeo_v1 t=' + accessToken,
          'Accept': 'application/json',
          'Content-Type': 'application/json',
          'User-Agent': 'vjeux-totd-medal-ranks',
        }
      });
      const json = await response.json();
      // Wait after every real request so we don't hammer the server.
      await sleep(5000);
      return json;
    }

    function sleep(ms) {
      return new Promise(resolve => setTimeout(resolve, ms));
    }
After this, the logic was pretty straightforward: I just implemented the algorithm I described at the beginning. The usual way to write it is to first implement the deepest part and test it standalone (fetching the rank for a single time), then wrap it for a map, then for a month, then for all the months.
    // Number of players who beat or matched the given time on a track.
    async function fetchRankFromTime(trackID, time) {
      const json = await fetchCachedJSON('https://trackmania.io/api/leaderboard/map/' + trackID + '?from=' + time);
      return json.tops[0].position - 1;
    }

    // Ranks for the four medal times of a single map.
    async function fetchRanks(trackID) {
      const map = await fetchCachedJSON('https://trackmania.io/api/map/' + trackID);
      const rankAT = await fetchRankFromTime(trackID, map.authorScore);
      const rankGold = await fetchRankFromTime(trackID, map.goldScore);
      const rankSilver = await fetchRankFromTime(trackID, map.silverScore);
      const rankBronze = await fetchRankFromTime(trackID, map.bronzeScore);
      return [map.authorplayer.name, rankAT, rankGold, rankSilver, rankBronze];
    }

    // All the tracks of the day for one month.
    async function fetchTOTDMonth(month) {
      const json = await fetchCachedJSON('https://trackmania.io/api/totd/' + month);
      const days = [];
      for (const jsonDay of json.days) {
        const [authorName, rankAT, rankGold, rankSilver, rankBronze] = await fetchRanks(jsonDay.map.mapUid);
        days.push({day: jsonDay.monthday, month: json.month, year: json.year, authorName, rankAT, rankGold, rankSilver, rankBronze});
      }
      return days;
    }

    async function fetchAll() {
      const days = [];
      for (const month of [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]) {
        days.push(...await fetchTOTDMonth(month));
      }
      console.log('data =', JSON.stringify(days.sort((a, b) => b.rankAT - a.rankAT)));
    }
There are two distinct parts to the project. The first part is the data collection, described above. The second is how to display all this data. I like to keep both separate.
In this case, the end result of the collection is a standalone JSON document that I prepend with data = so that I can include it as a <script>. This way in my front-end, also written in Vue for this project, I can use that global data variable to access it.
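As a rough sketch of that hand-off, the collection side only has to produce a script body that assigns the JSON to a global. `toDataScript` is a hypothetical helper name, and the row below holds made-up placeholder values, not real leaderboard data.

```javascript
// Hypothetical helper: wrap the collected rows into a script body that defines
// a global `data` variable once the file is loaded through a <script> tag.
function toDataScript(days) {
  return 'data = ' + JSON.stringify(days) + ';';
}

// Placeholder row, only here to show the shape of the output.
console.log(toDataScript([{ day: 1, rankAT: 42 }]));
// data = [{"day":1,"rankAT":42}];
```

Because the output is plain JavaScript, the front-end needs no fetch call or build step: loading the file is enough to make `data` available.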
Displaying the data
I didn't really know how to display the data. For each medal, we have the number of people who got that medal or better. I started by displaying a horizontal bar for each medal where 1px = 1 position. It worked pretty well but it was way too wide.
The Trackmania API stops giving any kind of precision past 10,000, so I used CSS percentages where 100% is 10,000 players, and the result worked well!
Since most tracks are finished by around 8k players, this cap rarely matters in practice.
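The scaling can be sketched as a tiny function that turns a rank into a CSS width. This is an assumption about how one might write it, not the code from my front-end; `barWidth` and `MAX_RANK` are illustrative names.

```javascript
// The API stops being precise past this position, so it becomes 100%.
const MAX_RANK = 10000;

// Width of a medal bar as a CSS percentage string.
function barWidth(rank) {
  // Clamp so that out-of-range values never overflow the bar.
  const clamped = Math.min(Math.max(rank, 0), MAX_RANK);
  return (clamped * 100) / MAX_RANK + '%';
}

console.log(barWidth(2309)); // 23.09%
```

In a Vue template, the returned string could be bound to an element's `width` style.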
Now that we have that for every single map, we can start having fun and sort this data in many ways.
Sorting by the number of people that got the author time gives us what I was looking for when going into this project. We can find the easiest maps:
As well as the hardest maps.
I've implemented various ways to sort, such as by release date, medal times and map author name. In doing so, I found out that when many maps are tied (for example, every map where 10k or more people got a medal), the resulting order depends on how the list was sorted before, so it changes from one sort to the next.
A quick and dirty fix is to first sort the data by all the other keys so that it always gives a stable list. It is wasteful but easy to implement by copying and pasting, and the dataset is small enough that it doesn't matter much in practice.
    sortAuthor: function() {
      // Array.prototype.sort is stable (ES2019+), so each earlier sort acts as a
      // tie-breaker for the later ones; the last sort is the primary key.
      this.days.sort((a, b) => (b.year * 10000 + b.month * 100 + b.day) - (a.year * 10000 + a.month * 100 + a.day));
      this.days.sort((a, b) => b.rankAT - a.rankAT);
      this.days.sort((a, b) => b.rankGold - a.rankGold);
      this.days.sort((a, b) => b.rankSilver - a.rankSilver);
      this.days.sort((a, b) => b.rankBronze - a.rankBronze);
      this.days.sort((a, b) => a.authorName.localeCompare(b.authorName));
    },
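If the repeated sorting ever bothers you, one alternative is a single comparator that falls back to the next key on ties. This is a sketch, not what the site uses: `compareBy`, `dateKey` and `byAuthor` are hypothetical names, and it assumes rows with the same fields as the scraper output.

```javascript
// Build one comparator from several: the first non-zero result wins,
// so later comparators only break ties left by the earlier ones.
function compareBy(...comparators) {
  return (a, b) => {
    for (const compare of comparators) {
      const result = compare(a, b);
      if (result !== 0) return result;
    }
    return 0;
  };
}

const dateKey = d => d.year * 10000 + d.month * 100 + d.day;

// Primary key: author name. Ties are broken by bronze, silver, gold and
// author-time ranks (descending), then by date (newest first), which is
// the same ordering the chained sorts produce.
const byAuthor = compareBy(
  (a, b) => a.authorName.localeCompare(b.authorName),
  (a, b) => b.rankBronze - a.rankBronze,
  (a, b) => b.rankSilver - a.rankSilver,
  (a, b) => b.rankGold - a.rankGold,
  (a, b) => b.rankAT - a.rankAT,
  (a, b) => dateKey(b) - dateKey(a),
);
```

With this, `this.days.sort(byAuthor)` does a single pass, and the tie-breaking order is spelled out in one place.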
Conclusion
This was a fun project and I'm happy that I was able to figure out how hard a map was in practice. I'd like to give big props to Miss who built all the Trackmania APIs I used during this project.