A System Health Check in Go

August 19, 2018

Recently I was talking to a friend who works in the distributed computing space. He is doing research on how messages travel between geographically distributed servers, and his experiments rely on those servers being active and healthy (if they’re not, the experiment may actually run but produce no usable results, wasting my friend’s valuable time!). So I wondered if I could write a little script in Go that could do a simple machine health status report.

Words in Space

August 01, 2018

Text analysis tasks like vectorization, document classification, and topic modeling require us to define what it means for any two documents to be similar. There are a number of different measures you can use to determine similarity for text (e.g. Minkowski, cosine, Levenshtein, Jaccard). In this post, we’ll take a look at a couple of these ways of measuring distance between documents in a specific corpus, and then visualize the result of our decisions on the distribution of the data using Yellowbrick.
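
To give a rough feel for how two of these measures can disagree, here is a minimal sketch in plain Python. It is not the code from the post and does not use Yellowbrick; the helper names (`tokenize`, `cosine_distance`, `jaccard_distance`) and the whitespace tokenizer are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of the term-count vectors."""
    counts_a, counts_b = Counter(tokenize(a)), Counter(tokenize(b))
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[t] * counts_b[t] for t in shared)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # assumption: an empty document is maximally distant
    return 1.0 - dot / (norm_a * norm_b)


def jaccard_distance(a, b):
    """Return 1 minus the ratio of shared to total distinct tokens."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    union = set_a | set_b
    if not union:
        return 0.0  # assumption: two empty documents are identical
    return 1.0 - len(set_a & set_b) / len(union)


if __name__ == "__main__":
    doc_a = "the cat sat on the mat"
    doc_b = "the cat ran"
    print("cosine: ", cosine_distance(doc_a, doc_b))
    print("jaccard:", jaccard_distance(doc_a, doc_b))
```

Cosine distance is sensitive to how often each term occurs, while Jaccard distance only looks at which terms are present, so the same pair of documents can sit closer together under one measure than under the other.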

Getting Started with Go

July 12, 2018

Over the last year, I’ve been slowly teaching myself Go. As a Python programmer who previously studied Java and Perl (veeeery briefly), I wanted to know what learning Go would teach me about programming and problem solving in general (à la the Blub Paradox). However, as I’ve continued learning Go, I’ve become increasingly curious about some of the unique features of the language (e.g. automatic memory management and goroutines), and how they might help me to solve classes of problems I haven’t yet encountered. But… before we get to all that, this is just a quick post that goes over some of the basics of getting started with Go, including notes on installation, the go tool, and workspaces.

SPARQL Queries for Local RDF Data

July 05, 2018

So you found some RDF data (yay!) from an archive somewhere, but there’s no active SPARQL endpoint (it’s all JSON APIs these days). You could use the methods implemented in the Python package rdflib to perform some lightweight searches of the triples, but what if you want to perform more complex queries without having to create a local triple store? In this post we’ll see how to use the declarative language of SPARQL to perform complex queries on a local RDF file (inspired by this Stack Overflow post).

SPARQL from Python

July 05, 2018

SPARQLWrapper is a simple Python wrapper around a SPARQL service for remote query execution. Not only does it enable us to write more complex queries to extract information from RDF than those exposed through a library like rdflib, it can also convert query results into other formats like JSON and CSV!