At our last meetup we voted on the competitions to tackle in the next Coding Session, and today I can announce the winners: Rossmann Store Sales and Right Whale Recognition. Both competitions are equally interesting and challenging!
Rossmann Store Sales
In the Rossmann Store Sales competition we need to forecast sales using store, promotion, and competitor data. One of the challenges in this competition is to transform the store data into robust numeric features.
The store data looks like this:
Store | StoreType | Assortment | CompetitionDistance | CompetitionOpenSinceMonth | CompetitionOpenSinceYear | Promo2 | Promo2SinceWeek | Promo2SinceYear | PromoInterval |
---|---|---|---|---|---|---|---|---|---|
1 | c | a | 1270 | 9 | 2008 | 0 | | | |
2 | a | a | 570 | 11 | 2007 | 1 | 13 | 2010 | Jan,Apr,Jul,Oct |
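To make the "numeric" part concrete, here is a minimal sketch of how a store row could be turned into numbers using only the Python standard library. It is one possible encoding, not the recommended one. The dictionary keys are the column names written without spaces, the category lists `STORE_TYPES` and `ASSORTMENTS` are assumptions about which values occur, and `encode_store` and `to_number` are hypothetical helper names. Blank cells are simply mapped to 0, which is a simplification you may want to revisit.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
STORE_TYPES = ["a", "b", "c", "d"]  # assumed set of store types
ASSORTMENTS = ["a", "b", "c"]       # assumed set of assortment levels


def to_number(value, default=0.0):
    """Convert a CSV cell to float; blank cells become `default`."""
    value = value.strip()
    return float(value) if value else default


def encode_store(row):
    """Turn one store row (dict of strings) into a dict of floats."""
    features = {}
    # One-hot encode the two categorical columns.
    for t in STORE_TYPES:
        features[f"StoreType_{t}"] = 1.0 if row["StoreType"] == t else 0.0
    for a in ASSORTMENTS:
        features[f"Assortment_{a}"] = 1.0 if row["Assortment"] == a else 0.0
    # Plain numeric columns; blanks fall back to 0.
    for key in ["CompetitionDistance", "CompetitionOpenSinceMonth",
                "CompetitionOpenSinceYear", "Promo2",
                "Promo2SinceWeek", "Promo2SinceYear"]:
        features[key] = to_number(row[key])
    # Expand "Jan,Apr,Jul,Oct" into twelve month flags. Names are cut to
    # three letters in case a longer abbreviation shows up in the data.
    promo_months = {m.strip()[:3] for m in row["PromoInterval"].split(",")
                    if m.strip()}
    for m in MONTHS:
        features[f"PromoInterval_{m}"] = 1.0 if m in promo_months else 0.0
    return features


# The two example rows from the table above, as csv.DictReader would yield them.
store_1 = {"Store": "1", "StoreType": "c", "Assortment": "a",
           "CompetitionDistance": "1270", "CompetitionOpenSinceMonth": "9",
           "CompetitionOpenSinceYear": "2008", "Promo2": "0",
           "Promo2SinceWeek": "", "Promo2SinceYear": "", "PromoInterval": ""}
store_2 = {"Store": "2", "StoreType": "a", "Assortment": "a",
           "CompetitionDistance": "570", "CompetitionOpenSinceMonth": "11",
           "CompetitionOpenSinceYear": "2007", "Promo2": "1",
           "Promo2SinceWeek": "13", "Promo2SinceYear": "2010",
           "PromoInterval": "Jan,Apr,Jul,Oct"}

features_1 = encode_store(store_1)
features_2 = encode_store(store_2)
```

One-hot flags avoid implying an order between store types, at the cost of a few extra columns. Whether that is better than a single integer code depends on the model you plan to use.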
The training data looks like this:
Store | DayOfWeek | Date | Sales | Customers | Open | Promo | StateHoliday | SchoolHoliday |
---|---|---|---|---|---|---|---|---|
1 | 5 | 2015-07-31 | 5263 | 555 | 1 | 1 | “0” | “1” |
2 | 5 | 2015-07-31 | 6064 | 625 | 1 | 1 | “0” | “1” |
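A training row needs similar treatment before a model can use it. The sketch below, again standard library only, splits `Date` into calendar parts and converts the quoted holiday flags into integers. `encode_training_row` is a hypothetical helper name, treating any `StateHoliday` value other than "0" as a holiday is an assumption, and `Customers` is left out on the assumption that it is not known in advance when forecasting.

```python
from datetime import date


def encode_training_row(row):
    """Turn one training row (dict of strings) into numeric features."""
    d = date.fromisoformat(row["Date"])
    return {
        "Store": int(row["Store"]),
        "DayOfWeek": int(row["DayOfWeek"]),
        "Year": d.year,
        "Month": d.month,
        "Day": d.day,
        "WeekOfYear": d.isocalendar()[1],  # ISO week number
        "Open": int(row["Open"]),
        "Promo": int(row["Promo"]),
        # Assumption: "0" means no state holiday, anything else is a holiday.
        "StateHoliday": 0 if row["StateHoliday"] == "0" else 1,
        "SchoolHoliday": int(row["SchoolHoliday"]),
    }


# First example row from the training table above.
train_row = {"Store": "1", "DayOfWeek": "5", "Date": "2015-07-31",
             "Sales": "5263", "Customers": "555", "Open": "1", "Promo": "1",
             "StateHoliday": "0", "SchoolHoliday": "1"}

train_features = encode_training_row(train_row)
target = int(train_row["Sales"])  # the value we want to forecast
```

The `Store` value is the key that links each training row to its entry in the store data.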
Christian Thiele did an amazing analysis of the provided data in R; you should check it out before getting started. Paul Shearer created an interactive visualization of the sales data using dygraphs. You can find more information and insights about the two scripts in the Scripts of the Week section on Kaggle.
Right Whale Recognition
In the Right Whale Recognition competition we need to automate the recognition process: given a dataset of aerial photographs of individual whales, identify each North Atlantic right whale. One of the challenges in this competition is to extract descriptive features from the training images, which have been hand-labeled by experts. To better identify the whales, we have to understand the right whale callosity patterns. I found this little game to get started.
MathWorks is providing complimentary MATLAB software for this competition; however, in the Coding Session we will only use open source software.