With the 2016 Olympic Games fast approaching, I decided to make an app for my new Pebble Time watch that would allow me to keep up with the medal count for each country as the games progress. This idea seemed simple at first, but led me through some interesting puzzles that I did not expect.
First, I needed a source for the data. The International Olympic Committee (IOC) provides a data feed for the Rio Olympics here: http://odf.olympictech.org/2016-Rio/rio_2016_OG.htm, but it lists all of the events separately, so I would need to sift through gigabytes of data for a simple count of how many medals each country has won. Other data feed websites exposed APIs with this data, but they were only available for purchase. As this is just a side project, I needed to get this information for free if possible.
I found that NBC exposes exactly the information that I was looking for on their website here: http://www.nbcolympics.com/medals, which left me with just one question: how do I get it programmatically?
The answer to that was provided by Mason Hensley (GitHub), who suggested that I use a site called ScrapingHub. This site allows you to set up a “spider” that will crawl a website and pick out the elements that you want to keep. For now, the spider that I set up crawls the NBC site for 2012 results, because the site does not have any elements for the 2016 games yet. For each table of results on this page, there’s a separate table for mobile viewing that is hidden when using the desktop site. I found that the mobile table was a much more direct way to get at the data, as there were fewer graphics, links, and other HTML elements in the way.
The code for the spider is written in Python and uploaded to ScrapingHub using the command-line tool shub. I set it up to run every five minutes so that the data will never be too old, and ScrapingHub exposes it behind an API that requires an ApiKey and ProjectID for authentication. The spider code mainly consists of XPath expressions that lead to the information that I want to gather.
import scrapy

from medalcount.items import MedalCountItem


class MedalCountSpider(scrapy.Spider):
    name = "counts"
    allowed_domains = ["nbcolympics.com"]
    start_urls = ["http://www.nbcolympics.com/medals"]

    def parse(self, response):
        for sel in response.xpath('//div[@class="standings=2012"]//table[@class="tab-grid-table-mobile grid-table-mobile"]/tbody/tr'):
            item = MedalCountItem()
            item['place'] = sel.xpath('td[@class="place"]/div/text()').extract()[0]
            item['country'] = sel.xpath('td[@class="country"]/div/a/img/@alt').extract()[0]
            item['gold'] = sel.xpath('td[@class="country"]/div/ul/li[@class="gold"]/text()').extract()[0]
            item['silver'] = sel.xpath('td[@class="country"]/div/ul/li[@class="silver"]/text()').extract()[0]
            item['bronze'] = sel.xpath('td[@class="country"]/div/ul/li[@class="bronze"]/text()').extract()[0]
            item['total'] = sel.xpath('td[@class="country"]/div/ul/li[@class="total-block to-right"]/text()').re(r'(\d+)')[0]
            yield item
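To experiment with the same row-by-row extraction without a Scrapy project, here is a minimal sketch that uses only Python's standard library. The HTML fragment is hypothetical: it is shaped to fit the selectors in the spider above and is not the real NBC markup, and the "Total: 103" text is an assumption made to show why the spider pulls the digits out with a regular expression. ElementTree supports only a subset of XPath, so `text()` and `@alt` become `.text` and `.get("alt")`.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical fragment modeled on the spider's selectors (not real NBC markup).
SAMPLE = """
<div class="standings=2012">
  <table class="tab-grid-table-mobile grid-table-mobile">
    <tbody>
      <tr>
        <td class="place"><div>1</div></td>
        <td class="country">
          <div>
            <a href="#"><img alt="United States" src="us.png" /></a>
            <ul>
              <li class="gold">46</li>
              <li class="silver">28</li>
              <li class="bronze">29</li>
              <li class="total-block to-right">Total: 103</li>
            </ul>
          </div>
        </td>
      </tr>
    </tbody>
  </table>
</div>
"""

ROW_PATH = ".//table[@class='tab-grid-table-mobile grid-table-mobile']/tbody/tr"
MEDAL_LIST = "td[@class='country']/div/ul/"


def parse_rows(markup):
    """Return one dict per table row, mirroring the fields the spider yields."""
    root = ET.fromstring(markup)
    rows = []
    for tr in root.findall(ROW_PATH):
        total_text = tr.find(MEDAL_LIST + "li[@class='total-block to-right']").text
        rows.append({
            "place": tr.find("td[@class='place']/div").text,
            "country": tr.find("td[@class='country']/div/a/img").get("alt"),
            "gold": tr.find(MEDAL_LIST + "li[@class='gold']").text,
            "silver": tr.find(MEDAL_LIST + "li[@class='silver']").text,
            "bronze": tr.find(MEDAL_LIST + "li[@class='bronze']").text,
            # Keep only the digits, as the spider's .re() call does.
            "total": re.search(r"\d+", total_text).group(),
        })
    return rows


rows = parse_rows(SAMPLE)
```

Real pages are rarely well-formed XML, so this only works on a tidy fragment; Scrapy's selectors are the more forgiving choice for the live site.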
Run on ScrapingHub, the spider produces the following results from the API:
[{
    "_type": "MedalCountItem",
    "place": "1",
    "country": "United States",
    "gold": "46",
    "silver": "28",
    "bronze": "29",
    "total": "103"
},{
    "_type": "MedalCountItem",
    "place": "2",
    "country": "China",
    "gold": "38",
    "silver": "29",
    "bronze": "21",
    "total": "88"
}...]
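Note that every value in the response is a string, including the medal counts. A consumer that wants to sort or add the counts would probably convert them first. The following is a rough sketch of that step, using the two records from the sample output above; the `normalize` helper and `NUMERIC_FIELDS` tuple are hypothetical names, not part of the spider or the API.

```python
import json

# The two records shown in the sample output; the real response is longer.
RAW = """[
  {"_type": "MedalCountItem", "place": "1", "country": "United States",
   "gold": "46", "silver": "28", "bronze": "29", "total": "103"},
  {"_type": "MedalCountItem", "place": "2", "country": "China",
   "gold": "38", "silver": "29", "bronze": "21", "total": "88"}
]"""

# Hypothetical helper names; these fields hold digits stored as strings.
NUMERIC_FIELDS = ("place", "gold", "silver", "bronze", "total")


def normalize(records):
    """Return copies of the records with the numeric fields converted to int."""
    cleaned = []
    for rec in records:
        row = dict(rec)
        for field in NUMERIC_FIELDS:
            row[field] = int(row[field])
        cleaned.append(row)
    return cleaned


standings = normalize(json.loads(RAW))
```

With integers in place, a quick sanity check such as comparing gold + silver + bronze against the scraped total becomes a one-liner.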
Now all that was left was to make the Pebble app that I had in mind from the start. I decided to use Pebble.js, since I had written my last project in C and wanted to explore a new way to build Pebble apps. All the app does is show an image while the API call takes place in the background, and then display the results. The Pebble framework provides several types of UI components, and I found that the Menu component was the best fit for the way I wanted to display the results on both the square and round Pebble watches.
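The app itself is written in Pebble.js and its source is not shown here, but the data shaping it needs is small enough to sketch in Python. This assumes each menu row carries a title and a subtitle; the `to_menu_items` name and the label format are hypothetical, chosen only to illustrate how one API record could become one row.

```python
# Sample records in the same shape as the API output (values are strings).
RECORDS = [
    {"place": "1", "country": "United States",
     "gold": "46", "silver": "28", "bronze": "29", "total": "103"},
    {"place": "2", "country": "China",
     "gold": "38", "silver": "29", "bronze": "21", "total": "88"},
]


def to_menu_items(records):
    """Map each medal record to a title/subtitle pair for a list-style menu."""
    items = []
    for rec in records:
        items.append({
            # Hypothetical label format: rank and country on the first line.
            "title": "{}. {}".format(rec["place"], rec["country"]),
            # Medal breakdown on the second line.
            "subtitle": "G {} S {} B {} T {}".format(
                rec["gold"], rec["silver"], rec["bronze"], rec["total"]),
        })
    return items


menu_items = to_menu_items(RECORDS)
```

Keeping each row to two short lines of text is one way to stay readable on both screen shapes, though the exact layout is a matter of taste.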
Getting the app onto the Pebble App Store was a very smooth process; it just required some screenshots, promotional images, a description, and the compiled project.
The app is available in the Pebble App Store as Olympic Medal Count, and is my first published app for any mobile device.