News

17 May 2021 Thanks to all the teams that participated this year and congratulations to the five finalists!
02 May 2021 Huawei joins Microsoft and SequoiaDB as a sponsor of the contest!
30 April 2021 Dear participants, many of you have tried Snowman, the tool for exploring/debugging entity matching results. We kindly ask for a few minutes of your time to provide feedback on it through this form. Your feedback would significantly help the group of HPI students improve this open-source tool and guide its future development. Good luck with the final hours of the contest!
24 April 2021 The problem has been solved and the evaluation process restarted. Considering this issue and the slowness of the queue over the last few days, the deadline has been moved to 30 April 2021 (CET).
24 April 2021 The supply of electricity to the evaluation server has been temporarily interrupted due to a technical problem in the surrounding area. We will let you know as soon as it is back and extend the deadline accordingly.
11 April 2021 A member of each active team is kindly requested to fill in this form to report the composition of the team (remember that a participant can be a member of only one team).
To prevent participants from using multiple teams to make submissions, solutions uploaded by teams that do not appear in this form are no longer evaluated.
08 April 2021 The new datasets are now included in the latest release of Snowman, which also adds several new features to the software.
07 April 2021 The NotebookLarge and Altosight datasets are available!
The submissions are now evaluated on the three official datasets.
06 April 2021 We are getting ready for the release: the evaluation system will be back tomorrow... together with two new datasets!
30 March 2021 Microsoft and SequoiaDB are the sponsors of the contest!
Awards are published in the Contest Overview section.
23 March 2021 In the dedicated Evaluation Process section, you can now see the ReproUnzip commands used for the evaluation.
Hope they can help you!
10 March 2021 The second phase of the contest is ready to start!
Solutions are now evaluated on the official Notebook dataset.
09 March 2021 The first official dataset (Notebook) is now available!
You can also download it together with Snowman.
03 March 2021 The first official dataset will be published on 09 March 2021.
25 February 2021 Team registration and solution submission are now open!
Please check the dedicated Submitting section on the Task page.
18 February 2021 The new contest page is up!
Please check the task and discover the toy dataset!

Student teams from degree-granting institutions are invited to compete in the annual SIGMOD Programming Contest. This year, the subject of the contest is to construct an Entity Resolution system. Teams' submissions will be judged on their performance on a set of supplied datasets.

The winning team will be awarded a prize of $7,000 (USD), and there will be an additional prize of $3,000 (USD) for the runner-up.
Prize money is donated by Microsoft, SequoiaDB, and Huawei.

This year's contest is brought to you by the DBGroup at the University of Modena and Reggio Emilia and by the Database Research Group at Roma Tre University. The organizing team is made up of Donatella Firmani (co-chair), Giovanni Simonini (co-chair), Andrea De Angelis, and Luca Zecchini.

Task Overview

For this year's contest, the task is Entity Resolution. Entity Resolution (ER) is the problem of identifying and matching different manifestations of the same real-world entity in a dataset.

For this task, you need to identify which instances represent the same real-world object. You are asked to solve Entity Resolution on several datasets, each one containing a different type of object (e.g., people, products, movies) and different distributions of data and noise. The challenge is to develop an Entity Resolution system that matches the instances representing the same real-world object with high precision and recall across all the datasets.
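As a rough illustration of what "high precision and recall" means for matching, the sketch below scores a set of predicted matching pairs against a set of true matching pairs. It is only a minimal sketch under stated assumptions: the function names, the instance ids, and the pairwise scoring scheme are hypothetical and are not taken from the contest's evaluation process, which is described on the Task page.

```python
from itertools import combinations


def normalize(pairs):
    """Treat each match as an unordered pair of two distinct instance ids."""
    return {tuple(sorted(p)) for p in pairs if p[0] != p[1]}


def pairs_from_clusters(clusters):
    """Expand groups of instances (one group per real-world object)
    into the set of all matching pairs they imply."""
    out = set()
    for cluster in clusters:
        out.update(combinations(sorted(set(cluster)), 2))
    return out


def precision_recall_f1(predicted, truth):
    """Pairwise precision, recall, and F1 (assumed metric, for illustration)."""
    predicted, truth = normalize(predicted), normalize(truth)
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical toy data: two real-world objects, "a" and "b".
truth = pairs_from_clusters([["a1", "a2", "a3"], ["b1", "b2"]])
predicted = [("a2", "a1"), ("a1", "a3"), ("a3", "b1")]

p, r, f = precision_recall_f1(predicted, truth)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
# prints: precision=0.667 recall=0.500 f1=0.571
```

In this toy case two of the three predicted pairs are correct (precision 2/3), and two of the four true pairs are found (recall 1/2). A system that outputs very few pairs can reach high precision with low recall, and one that outputs many pairs can do the opposite, which is why both are considered.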

More details about this year's problem can be found on the Task page.

Important Dates

18 February 2021 New site up. Contest requirements specification and toy dataset available.
25 February 2021 Team registration begins. Leaderboard available.
09 March 2021 First official dataset (Notebook) available together with Snowman.
07 April 2021 NotebookLarge and Altosight datasets available.
30 April 2021 (CET) Final submission deadline.
17 May 2021 Finalists notified.
20-25 June 2021 ACM SIGMOD/PODS 2021 Conference.

Sponsors

Microsoft, SequoiaDB, Huawei

Contacts

Ask questions and stay up to date by joining the ACM SIGMOD 2021 Programming Contest Google Group, or contact us at sigmod21contest@gmail.com.