A Parrot Trainer Eats Crow

March 10, 2021

In this post, we’ll consider how models with millions of parameters, trained on massive datasets, can be both “low bias” and also very biased, and begin to think through what we in the ML community might be able to do about it.

Why don't developers give much cred to low-code/no-code tools?

February 24, 2021

Recently a good friend asked me why developers don’t give much credence to low-code and no-code tools. It’s an interesting question! One problem is that “low code” is a relative term: technically something like D3 is a low-code solution for building interactive data viz. Bootstrap is a low-code tool for making webpages. Keras is a low-code way for me to template out my TensorFlow models. And if Kubernetes is mostly just YAML, does that mean it’s “no code”? Nevertheless, the terms “low code” and “no code” do make me cringe. In this post, I try to tease out exactly why.
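
To make that relativity concrete, here’s a minimal sketch of the Keras case: a toy classifier defined in a handful of lines. The layer sizes, feature count, and loss are illustrative placeholders, not anything from a real project.

```python
# A minimal sketch of "low code" in the Keras sense: a toy binary classifier
# defined in a few statements. Layer sizes and feature count are illustrative only.
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([
    layers.Input(shape=(100,)),            # e.g. 100 numeric features
    layers.Dense(64, activation="relu"),   # one hidden layer
    layers.Dense(1, activation="sigmoid"), # binary prediction
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```

A handful of statements to a trainable model is certainly “low code” relative to writing the same graph by hand in raw TensorFlow, which is exactly why the term is so slippery.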

Embedded Binaries for Go

February 06, 2021

Recently a friend asked about Go packages for embedding binary data into code… As it turns out, there are a lot of options!

Tailored Learning

December 08, 2020

In this series on multilingual models, we’ll construct a pipeline that leverages a transfer learning model to train a text classifier using text in one language, and then apply that trained model to make predictions on text in another language. In our first post, we considered the rise of small data and the role of transfer learning in delegated literacy. In this post, we’ll prepare our domain-specific training corpus and construct our tailored learning pipeline using Multilingual BERT.
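
As a preview of where the pipeline is headed, here is a minimal sketch of loading a multilingual BERT checkpoint for sequence classification with the Hugging Face transformers library; the library choice, checkpoint name, label count, and example text are my assumptions for illustration, not necessarily what the full post uses.

```python
# Sketch only: load a multilingual BERT checkpoint for classification, fine-tune
# it on text in one language (not shown), then score text in another language.
# The checkpoint name, label count, and example sentence are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

checkpoint = "bert-base-multilingual-cased"  # assumed mBERT checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=2)

# After fine-tuning on an English training corpus, the same model can be
# applied to text in another language, e.g. French:
inputs = tokenizer("Ce produit est excellent.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```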

The Rise of Small Data (aka Delegated Literacy)

December 07, 2020

In this series on multilingual models, we’ll construct a pipeline that leverages a transfer learning model to train a text classifier using text in one language, and then apply that trained model to make predictions on text in another language. In this first post, we’ll look at how expectations about data size have shifted over the last two decades, and talk about how that shift informs the model architecture for our text classifier.