CMSC 818e Day 5

September 12, 2018

These are notes taken during CMSC 818e: Distributed And Cloud-Based Storage Systems. Course webpage and syllabus here.

Everything Old is New Again

September 11, 2018

How does Google work under the hood? I remember first reading about Google’s PageRank algorithm and marveling at how innovative and elegant it was. I suppose I have come to expect such greatness from a company with seemingly unlimited resources. For this reason, reading the Google File System paper came as a bit of a surprise to me. Although Ghemawat et al. frame GFS as radically different from prior file system implementations, I was more struck by its resemblances to earlier systems than by its novelties. In this post I discuss my reactions to the paper.

CMSC 818e Day 4

September 10, 2018

These are notes taken during CMSC 818e: Distributed And Cloud-Based Storage Systems. Course webpage and syllabus here.

Optimizing cloud storage (but sacrificing privacy?)

September 09, 2018

What if we thought of files not as data but as recipes for creating data? In the log-structured file system paper, we were asked to think of files not as complete entities assigned some fixed place on disk, but as pieces that could be stored separately and later aggregated on demand. In “Knockoff: Cheap versions in the cloud,” Dou et al. present a new approach: a file system that examines how files are constructed and uses that information to decide whether to send and store a file in the traditional way (i.e., represented as its data) or as a “recipe,” a log of all the system calls, operations, mouse movements, and clicks that went into producing the file.

The Elephant file system

September 09, 2018

I was recently reading Creativity, Inc. by Ed Catmull and Amy Wallace, which is about the genesis of Pixar, a company that was made by a single movie (Toy Story). What most people don’t know is that the company was very nearly unmade when, a year into the production of Toy Story 2, the entire movie was accidentally deleted from the file system where it was stored. TL;DR: everything turns out OK, but it got me thinking: how do our file systems protect us from ourselves? In “Deciding when to forget in the Elephant file system,” Douglas Santry et al. describe their development of, and experiments with, a new kind of file system that, by default, refuses to forget.