Large Scale Data Integration Project (LSDIPro) WiSe2024/2025

In this course, students develop solutions for large-scale data integration. Working in groups of up to four, they reproduce an existing research prototype and enhance it with their own ideas. Each group is accompanied by a mentor from the D2IP group, to whom the group reports and who tracks its progress. Students will learn to implement scalable algorithms and evaluate them systematically, to read and interpret technical papers, and to critically judge experimental results. At the same time, they will learn to deal with data heterogeneity problems at scale.


Content

  • Selection of a project and building a team
  • Discussion rounds on design, implementation, tests, and experiments
  • Prototype implementation, tests, and experiments
  • 15-minute oral presentation of the created prototype

Deliverables

  • Code in the form of a Git repository
  • Final Presentation
  • Individual contribution sheet (single A4 page)


Schedule (Thursday 10:00 AM, E-N 719)

Capacity:

  • 17.10.2024: Topic selection, group formation (Introduction Slides)
  • 24.10.2024: Weekly meetings: identify the main idea, the problem being solved, and who/what is involved; understand the paper; get familiar with the datasets and repositories (Notebook for BTW Challenge)
  • 31.10.2024: Weekly meetings (development): plan discussion
  • 07.11.2024: Weekly meetings (development)
  • 14.11.2024: Weekly meetings (development): first pipeline of the software should be ready
  • 21.11.2024: Weekly meetings (development)
  • 28.11.2024: Weekly meetings (development)
  • 05.12.2024: Expert review (A different group will test your system)
  • 12.12.2024: System improvement, experiments
  • 19.12.2024: Weekly meetings (experiments)
  • 09.01.2025: Weekly meetings (experiments), start writing the report
  • 16.01.2025: Weekly meetings (experiments, visualization)
  • 23.01.2025: Weekly meetings (experiments, visualization)
  • 30.01.2025: Weekly meetings: wrap up documentation and presentation slides
  • 06.02.2025: Final presentations
  • 13.02.2025: Final presentations
  • 14.02.2025: Contribution sheet submission deadline

Organization

  • Lecturer: Prof. Dr. Ziawasch Abedjan, D2IP
  • Teaching Assistant: Dr. Binger Chen, D2IP
  • Teaching Assistant: Muaid Mughrabi, D2IP
  • Grading: pass with ≥ 40% of the points
  • Meetings: Thursday 10:00 - 12:00, E-N 719