CMSC 818e Day 2


These are notes taken during CMSC 818e: Distributed And Cloud-Based Storage Systems. Course webpage and syllabus here.

Day Two

Discussion of Immutability Changes Everything

  • Pat Helland’s “Immutability Changes Everything”.
  • fight ambiguity with append-only applications
  • use content-defined names (e.g. hash of the file)
  • Amol’s DataHub - GitHub for datasets - how to store them efficiently? Each dataset is treated as immutable; there might be new versions, but the datasets themselves don’t change.
  • Deal with massively parallel “big data” with MapReduce; depends on immutable data – doesn’t matter when the actual computation on a subslice happens, because it won’t change.
  • Append-only computing: logs are the truth (log is immutable); changes are applied sequentially, ordered by a single master or by consensus.
  • Immutable is not always immutable:
    • optimizing for read access: indexes, de-normalization
    • farming out portions of work, with re-try
    • tension between fast access (tiny tables) and expense of joins
    • normalization is there to eliminate update anomalies.
  • Immutability enables unambiguous identity (content-defined names)
  • Immutability enables massive replication/caching/parallelism
  • Immutability eliminates locking
  • Immutability enables re-computation
    • from immutable data
    • of immutable data
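
To make the content-defined naming point concrete, here is a minimal sketch of a write-once store in Go. The `Store` type and its `Put`/`Get` methods are hypothetical names invented for illustration (they are not from Helland's paper or the course code), and SHA-256 is just one reasonable choice of hash. The name of a blob is derived from its bytes, so identical content always gets the same name and a stored blob never needs to be updated in place.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Store is a hypothetical in-memory, write-once blob store.
// Blobs are keyed by the hex SHA-256 of their content.
type Store struct {
	blobs map[string][]byte
}

// NewStore returns an empty store.
func NewStore() *Store {
	return &Store{blobs: make(map[string][]byte)}
}

// Put stores data under its content-defined name and returns that name.
// Storing the same bytes twice is a no-op: the name already exists and
// refers to exactly the same content.
func (s *Store) Put(data []byte) string {
	sum := sha256.Sum256(data)
	name := hex.EncodeToString(sum[:])
	if _, ok := s.blobs[name]; !ok {
		// Copy so later changes to the caller's slice cannot alter the blob.
		s.blobs[name] = append([]byte(nil), data...)
	}
	return name
}

// Get returns a copy of the blob with the given name, if present.
func (s *Store) Get(name string) ([]byte, bool) {
	b, ok := s.blobs[name]
	if !ok {
		return nil, false
	}
	return append([]byte(nil), b...), true
}

func main() {
	s := NewStore()
	v1 := s.Put([]byte("dataset v1"))
	v2 := s.Put([]byte("dataset v2"))
	again := s.Put([]byte("dataset v1"))

	// Same content gives the same name; a new version gets a new name.
	fmt.Println(v1 == again, v1 == v2)
}
```

Because there is no update operation, a new version of a dataset is simply a new blob with a new name, and any replica or cache holding a given name can be trusted to hold the same bytes.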

Project 1

  • Build a file system in Go. Will be completely in-memory. Implemented as in-memory tree specified by root.
  • gitlab link
  • Use Fuse - an interface for building user-level file systems.
  • modify dfs.go per README; you don’t have to implement all of the functions at once. Start with ‘hello’ and slowly add in the others.
  • bazil (the Go FUSE library) can serve a file system that doesn’t crash even with only a very small subset of the methods defined.
  • use Piazza to communicate with class & Pete
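
As a rough illustration of "an in-memory tree specified by root", here is a sketch using only the Go standard library. The `Node` type and the `NewDir`, `Create`, and `Lookup` helpers are assumptions made up for this example; they are not the contents of dfs.go, and the FUSE wiring is deliberately left out.

```go
package main

import (
	"fmt"
	"strings"
)

// Node is a hypothetical in-memory file or directory.
// Directories use Children; regular files use Data.
type Node struct {
	Name     string
	IsDir    bool
	Data     []byte
	Children map[string]*Node
}

// NewDir returns an empty directory node, e.g. the root of the tree.
func NewDir(name string) *Node {
	return &Node{Name: name, IsDir: true, Children: make(map[string]*Node)}
}

// Create adds a new file or directory called name under parent.
func Create(parent *Node, name string, isDir bool) (*Node, error) {
	if !parent.IsDir {
		return nil, fmt.Errorf("%q is not a directory", parent.Name)
	}
	if _, exists := parent.Children[name]; exists {
		return nil, fmt.Errorf("%q already exists", name)
	}
	n := &Node{Name: name, IsDir: isDir}
	if isDir {
		n.Children = make(map[string]*Node)
	}
	parent.Children[name] = n
	return n, nil
}

// Lookup walks from root along a slash-separated path.
func Lookup(root *Node, path string) (*Node, bool) {
	cur := root
	for _, part := range strings.Split(path, "/") {
		if part == "" {
			continue // skip the leading slash and any doubled slashes
		}
		if !cur.IsDir {
			return nil, false
		}
		next, ok := cur.Children[part]
		if !ok {
			return nil, false
		}
		cur = next
	}
	return cur, true
}

func main() {
	root := NewDir("/")
	docs, _ := Create(root, "docs", true)
	hello, _ := Create(docs, "hello", false)
	hello.Data = []byte("hello, world\n")

	if n, ok := Lookup(root, "/docs/hello"); ok {
		fmt.Printf("%s", n.Data)
	}
}
```

In the actual project, the FUSE methods would presumably be thin wrappers around operations like these, which is one way to add functionality a method at a time.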

