In this course, students develop solutions for large-scale data integration. Working in groups of up to four, students reproduce an existing research prototype, starting from the paper that describes it, and then enhance it with their own ideas. Each group is accompanied by a mentor from the D2IP group, who serves as the point of contact for reporting and tracking progress. Students will learn to implement scalable algorithms, evaluate them systematically, read and interpret technical papers, and critically judge experimental results. Along the way, they will also learn to deal with data heterogeneity problems at scale.