Geoffrey Yu

Computer Science PhD Student

Massachusetts Institute of Technology (MIT)

About Me

I am a Computer Science PhD student at MIT, advised by Professor Tim Kraska. I am part of the Data Systems Group within the Computer Science and Artificial Intelligence Laboratory (CSAIL).

I am broadly interested in computer systems research. My research interests span data systems, distributed systems, and systems for machine learning. I also enjoy thinking about problems at the intersection of systems and human-computer interaction, as I strongly believe in the value of creating usable systems software.

Before starting my PhD, I earned my master's degree in Computer Science at the University of Toronto, advised by Professor Gennady Pekhimenko. Before graduate school, I was a Software Engineering student at the University of Waterloo and graduated in 2018.

News

August 29, 2024
At VLDB 2024 in Guangzhou, I presented our full paper on BRAD: a system that virtualizes and automatically optimizes cloud data infrastructures.
August 28, 2023
At VLDB 2023 in Vancouver, I presented our group's vision for BRAD and jointly presented TreeLine.
July 1, 2023
Our vision for a new unified cloud data processing system called BRAD was accepted to appear at VLDB 2023!

Publications

Blueprinting the Cloud: Unifying and Automatically Optimizing Cloud Data Infrastructures with BRAD

Geoffrey X. Yu, Ziniu Wu, Ferdi Kossmann, Tianyu Li, Markos Markakis, Amadou Ngom, Samuel Madden, Tim Kraska.

Proceedings of the VLDB Endowment (VLDB), Vol. 17, No. 11, 2024.

@article{brad-yu24,
  author = {Yu, Geoffrey X. and Wu, Ziniu and Kossmann, Ferdi and Li, Tianyu 
    and Markakis, Markos and Ngom, Amadou and Madden, Samuel and Kraska, Tim},
  doi = {10.14778/3681954.3682026},
  journal = {{Proceedings of the VLDB Endowment}},
  month = {8},
  number = {11},
  pages = {3629--3643},
  title = {{Blueprinting the Cloud: Unifying and Automatically Optimizing 
    Cloud Data Infrastructures with BRAD}},
  volume = {17},
  year = {2024},
}
Check Out the Big Brain on BRAD: Simplifying Cloud Data Processing with Learned Automated Data Meshes

Tim Kraska*, Tianyu Li*, Samuel Madden*, Markos Markakis*, Amadou Ngom*, Ziniu Wu*, Geoffrey X. Yu*.

Proceedings of the VLDB Endowment (VLDB), Vol. 16, No. 11, 2023. Vision Paper.

@article{brad-kraska23,
  author = {Kraska, Tim and Li, Tianyu and Madden, Samuel and Markakis, Markos
    and Ngom, Amadou and Wu, Ziniu and Yu, Geoffrey X.},
  doi = {10.14778/3611479.3611526},
  journal = {{Proceedings of the VLDB Endowment}},
  month = {8},
  number = {11},
  pages = {3293--3301},
  title = {{Check Out the Big Brain on BRAD: Simplifying Cloud Data Processing
    with Learned Automated Data Meshes}},
  volume = {16},
  year = {2023},
}
TreeLine: An Update-In-Place Key-Value Store for Modern Storage

Geoffrey X. Yu*, Markos Markakis*, Andreas Kipf*, Per-Åke Larson, Umar Farooq Minhas, Tim Kraska.

Proceedings of the VLDB Endowment (VLDB), Vol. 16, No. 1, 2022.

@article{treeline-yu23,
  author = {Yu, Geoffrey X. and Markakis, Markos and Kipf, Andreas and
    Larson, Per-Åke and Minhas, Umar Farooq and Kraska, Tim},
  doi = {10.14778/3561261.3561270},
  journal = {{Proceedings of the VLDB Endowment}},
  month = {9},
  number = {1},
  pages = {99--112},
  title = {{TreeLine: An Update-In-Place Key-Value Store for Modern Storage}},
  volume = {16},
  year = {2022},
}
Habitat: A Runtime-Based Computational Performance Predictor for Deep Neural Network Training

Geoffrey X. Yu, Yubo Gao, Pavel Golikov, Gennady Pekhimenko.

USENIX Annual Technical Conference (USENIX ATC), 2021.

@inproceedings{habitat-yu21,
  title = {{Habitat: A Runtime-Based Computational Performance Predictor
    for Deep Neural Network Training}},
  author = {Yu, Geoffrey X. and Gao, Yubo and Golikov, Pavel and
    Pekhimenko, Gennady},
  booktitle = {{Proceedings of the 2021 USENIX Annual Technical Conference
    (USENIX ATC'21)}},
  year = {2021},
}
Skyline: Interactive In-Editor Computational Performance Profiling for Deep Neural Network Training

Geoffrey X. Yu, Tovi Grossman, Gennady Pekhimenko.

ACM Symposium on User Interface Software and Technology (UIST), 2020.

@inproceedings{skyline-yu20,
  title = {{Skyline: Interactive In-Editor Computational Performance Profiling
    for Deep Neural Network Training}},
  author = {Yu, Geoffrey X. and Grossman, Tovi and Pekhimenko, Gennady},
  booktitle = {{Proceedings of the 33rd ACM Symposium on User Interface
    Software and Technology (UIST'20)}},
  year = {2020},
}

* Denotes equal contribution.

Demonstrations

Skyline: Interactive In-Editor Performance Visualizations and Debugging for DNN Training

Geoffrey X. Yu, Tovi Grossman, Gennady Pekhimenko.

Machine Learning and Systems (MLSys), 2020. Demonstration Track, Non-archival.

TBD Suite: Benchmarking and Profiling Tools for DNNs

Geoffrey X. Yu, Hongyu Zhu, Anand Jayarajan, Bojian Zheng, Abhishek Tiwari, Gennady Pekhimenko.

Machine Learning and Systems (MLSys), 2019. Demonstration Track, Non-archival.