MiDAS Depth Exam

The 2024/25 Depth Exam is (tentatively) scheduled for January 23rd and 24th, 2025.

The 2024/25 Depth Exam papers are available below (more to come)!

Overview: The MiDAS depth exam is a take-home exam that covers the two main areas of the group: (i) data management and (ii) data mining. These areas span several subareas, including data systems, indexing, query processing, algorithmic data mining, and graph mining. Note that the department requires the depth exam to be completed on a set schedule (see www.bu.edu/cs/phd-program/phd-program-milestones).

Eligibility: To be eligible for the MiDAS depth exam, a PhD student should (i) be advised or co-advised by a MiDAS faculty member, (ii) be in their 2nd or 3rd year, and (iii) have discussed taking the depth exam with their advisor.

Process: For every iteration of the depth exam, MiDAS faculty will provide a list of ~24 papers, which the candidates will have about two months to study. The exam will take place on a predefined date announced on this page. The exam will contain four (4) subjects, two (2) for each of the main areas. All four subjects will have the nature of an open research challenge rather than an exercise to solve, and each candidate will choose any three (3) of these subjects to address. The exam will be evaluated on (a) understanding of the research area, (b) creativity, and ultimately (c) the correctness and quality of the proposed approach. Candidates return the completed take-home exam within two (2) days.

Data Management

ZNS: Avoiding the Block Interface Tax for Flash-based SSDs

Matias Bjørling, Abutalib Aghayev, Hans Holmberg, Aravind Ramesh, Damien Le Moal, Gregory R. Ganger, George Amvrosiadis
USENIX ATC, 2021 | Download PDF

LSM-based storage techniques: a survey

Chen Luo, Michael J. Carey
The VLDB Journal, 2020 | Download PDF

Survey of vector database management systems

James Jie Pan, Jianguo Wang, Guoliang Li
The VLDB Journal, 2024 | Download PDF

SingleStore-V: An Integrated Vector Database System in SingleStore

Cheng Chen, Chenzhe Jin, Yunan Zhang, Sasha Podolsky, Chun Wu, Szu-Po Wang, Eric Hanson, Zhou Sun, Robert Walzer, Jianguo Wang
PVLDB, 2024 | Download PDF

WALTZ: Leveraging Zone Append to Tighten the Tail Latency of LSM Tree on ZNS SSD

Jongsung Lee, Donguk Kim, Jae W. Lee
PVLDB, 2023 | Download PDF

CXL and the Return of Scale-Up Database Engines

Alberto Lerner, Gustavo Alonso
PVLDB, 2024 | Download PDF

Data Mining

Attention is All You Need

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
NeurIPS, 2017 | Download PDF

node2vec: Scalable Feature Learning for Networks

Aditya Grover, Jure Leskovec
KDD, 2016 | Download PDF

Maximizing a Monotone Submodular Function Subject to a Matroid Constraint

Gruia Calinescu, Chandra Chekuri, Martin Pál, Jan Vondrák
SIAM Journal on Computing, 2011 | Download PDF

The Generalized Mean Densest Subgraph Problem

Nate Veldt, Austin R. Benson, Jon Kleinberg
KDD, 2021 | Download PDF

Faster Linear Algebra for Distance Matrices

Piotr Indyk, Sandeep Silwal
NeurIPS, 2022 | Download PDF

Is Cosine-Similarity of Embeddings Really About Similarity?

Harald Steck, Chaitanya Ekanadham, Nathan Kallus
WWW, 2024 | Download PDF