by Brian Northan
Brian with wife Jessica Hageman Northan
For the last few years, Ed Hampston, Jon Rocco, and I have been scoring the Grand Prix. Before that, Jim Moore did it for many years. In 2013, Jim asked me to help him automate the Grand Prix scoring. He had been doing it manually in Excel, and large races took him up to two days to score.
I work as a contract image analysis programmer. One of my main tools was (and is) ImageJ, an open-source image analysis platform for biologists, so I was working mostly in Java at the time. As a result, the first implementation of the scoring engine (which I called score-ware) was in Java. Click here to see it. Note that the first version is now out of service: it used the wrong technologies and was replaced by a new version.
As I developed the scoring engine, Jim and I both scored the races for two years. The main technical challenge was matching runners in the race results with runners in the membership database under imperfect conditions: spelling mistakes, name changes, nicknames, address changes, and missing information.
We would compare results, and I'd tweak the scoring engine to replicate Jim’s thinking. Over the years the scoring engine got better. The heart of it was a “fuzzy logic Levenshtein” matcher that was fairly robust, though it still made some mistakes.
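For readers curious about what Levenshtein matching looks like, here is a minimal sketch in Python. It is not the score-ware code: the function names (`levenshtein`, `similarity`, `best_match`) and the sample names are made up for illustration, and it compares names only, whereas a real matcher would likely also weigh other fields such as age or address.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance: the fewest single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Scale the edit distance to a score from 0.0 (nothing alike) to 1.0 (identical)."""
    a, b = a.strip().lower(), b.strip().lower()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def best_match(result_name, member_names):
    """Return (member_name, score) for the closest member, or (None, 0.0) if there are none."""
    best_name, best_score = None, 0.0
    for member in member_names:
        score = similarity(result_name, member)
        if best_name is None or score > best_score:
            best_name, best_score = member, score
    return best_name, best_score


# Hypothetical membership list and a misspelled race result.
members = ["John Smith", "Mary Jones", "Pat Example"]
match, score = best_match("Jon Smith", members)
```

In this example "Jon Smith" is one edit away from "John Smith", so that member comes back as the closest match with a high score, while the other names score much lower.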
I also started to realize that Java was not the best technology for data analysis. I decided to convert the app to Python so I could take advantage of its rich ecosystem of data analysis tools. The current version of the scorer is here. It's called the score-ware site because in the future it could be integrated with Django, a Python web framework. I have not released a web version yet, but may do so in the future.
Scoring the Grand Prix still requires some manual work. The scoring engine is set to ask a human (me) to confirm matches below a certain threshold. Once the scoring is done, Ed Hampston and Jon Rocco check the results. They often find mistakes, and one of us fixes them. Then Ed posts the results to the web page, which is implemented in Concrete5.
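The "ask a human below a threshold" step can be sketched roughly as follows. This is an illustration only, not the actual scorer: the function names, the 0.85 threshold, and the sample candidates are all assumptions, and each candidate is assumed to arrive with a similarity score between 0 and 1 already computed.

```python
def resolve_matches(candidates, confirm, threshold=0.85):
    """Accept high-scoring matches automatically and ask about the rest.

    candidates: iterable of (result_name, member_name, score) tuples.
    confirm:    function called only for scores below the threshold;
                it returns True to accept the match.
    threshold:  hypothetical cutoff, not the value the real engine uses.
    """
    accepted = []
    for result_name, member_name, score in candidates:
        if score >= threshold or confirm(result_name, member_name, score):
            accepted.append((result_name, member_name))
    return accepted


def ask_on_console(result_name, member_name, score):
    """One possible confirm function: prompt the person doing the scoring."""
    answer = input(f"Is '{result_name}' the member '{member_name}' (score {score:.2f})? [y/n] ")
    return answer.strip().lower().startswith("y")


# Hypothetical candidates. A stand-in function plays the human here
# so the example runs without prompting anyone.
candidates = [
    ("John Smith", "John Smith", 1.00),
    ("Jon Smyth", "John Smith", 0.80),
    ("A. Nother", "Mary Jones", 0.30),
]
asked = []

def stand_in_human(result_name, member_name, score):
    asked.append(result_name)
    return score > 0.5

accepted = resolve_matches(candidates, stand_in_human)
```

Passing the confirmation step in as a function keeps the prompt separate from the matching logic, so `ask_on_console` could be swapped in for interactive scoring.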
It would be nice to read our membership database directly, write the results to it directly, and have them automatically posted. This may happen in the future, but it will depend partly on how our other web technologies are implemented and whether they can easily interface with Python.
If you have an IT background and are interested in helping HMRRC with its web technology, let me know at firstname.lastname@example.org.
The next Grand Prix races are Runnin' of the Green, March 9, followed in April by the Delmar Dash. Come sign up and get some Grand Prix points.
Brian in Boston Marathon shirt