Crash 💥 notes for word embeddings, word2vec, GloVe, and all that

The problem: you want to compute distances or run a regression of some sort, but your data is not numeric; it is words. How do we turn words into numbers so we can compute with them? The aim of this post: give an overview of effective methods available for converting words into numbers…
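
As a taste of what the post covers, here is a minimal sketch of one such method, word2vec, via the gensim library; the toy corpus and hyperparameters below are illustrative assumptions, not taken from the post:

```python
# A minimal sketch of turning words into vectors with gensim's word2vec.
# The toy corpus and hyperparameters are assumptions for illustration.
from gensim.models import Word2Vec

corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
    ["cats", "and", "dogs", "are", "pets"],
]

# Train a small skip-gram model: each word becomes a dense numeric vector.
model = Word2Vec(corpus, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

vec = model.wv["cat"]                      # a 50-dimensional numpy array
print(model.wv.similarity("cat", "dog"))   # cosine similarity between words
```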

A lottery problem

Alice and Bob play the lottery every day. At the end of the day, a winner is announced. The winner can be either Alice, Bob, or nobody. Of course, since Alice and Bob are playing every day, they can win multiple times (and also lose multiple times). Since there can be only one winner, Alice…
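
The full question is truncated above, but the setup itself is easy to simulate. A quick Monte Carlo sketch, with assumed daily win probabilities for Alice and Bob:

```python
# Monte Carlo sketch of the daily lottery described above. The win
# probabilities are assumptions for illustration; the post's actual
# question is truncated in the excerpt.
import random

P_ALICE, P_BOB = 0.3, 0.2   # assumed daily win probabilities (rest: nobody)

def simulate(days, seed=0):
    rng = random.Random(seed)
    wins = {"alice": 0, "bob": 0, "nobody": 0}
    for _ in range(days):
        u = rng.random()
        if u < P_ALICE:
            wins["alice"] += 1
        elif u < P_ALICE + P_BOB:
            wins["bob"] += 1
        else:
            wins["nobody"] += 1
    return wins

print(simulate(10_000))
```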

AI progress in 10 years: from cats and dogs to full-on descriptions

Measuring the progress of AI can be tricky. How do you know if Google’s AI is smarter than Amazon’s? What does “smarter” mean anyway? In 2006, AI researcher Fei-Fei Li at Stanford University came up with an idea. She created the ImageNet competition, where every year different research groups have their AIs compete for…

Did the early Shelter-In-Place order do anything for the SF Bay Area? Synthetic control for answering a what-if.

On March 16, 2020, the San Francisco Bay Area became the first region in the United States to issue a “shelter-in-place” order to slow the spread of COVID-19. Just a few days later, on March 20, Governor Gavin Newsom declared a statewide SIP order that has persisted to this date. Did starting SIP a…
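
For readers unfamiliar with the method, here is a simplified sketch of the core synthetic-control computation on fake data: fit non-negative weights over control regions using only pre-intervention outcomes (the classic method also constrains the weights to sum to one, omitted here for brevity), then use those weights to project the "what-if" counterfactual:

```python
# Sketch of a simplified synthetic control: weight control regions so their
# combined outcome tracks the treated region before the intervention, then
# extend that combination forward as the counterfactual. Data here is fake.
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(0)
T_pre, T_post, n_controls = 30, 20, 5

controls = rng.normal(size=(T_pre + T_post, n_controls)).cumsum(axis=0)
treated = controls[:, :3].mean(axis=1) + rng.normal(scale=0.1, size=T_pre + T_post)

# Fit weights on pre-intervention data only (non-negative least squares).
w, _ = nnls(controls[:T_pre], treated[:T_pre])

# Counterfactual: what the treated region might have looked like untreated.
counterfactual = controls @ w
effect = treated[T_pre:] - counterfactual[T_pre:]
print("estimated post-intervention effect per period:", effect.mean())
```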

Learning from noisy labels with positive unlabeled learning

We’ll discuss a simple trick to deal with the case where we have positive examples only and unlabeled examples that could be either positive or negative (or have been heavily mislabeled and can be treated as unlabeled). This is a type of semi-supervised learning where we do not have access to the full labeled data…
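
One standard version of this trick is from Elkan & Noto (2008); the post’s exact variant may differ, but the idea is to train a classifier to separate labeled positives from unlabeled examples, then rescale its scores by the labeling frequency estimated on held-out positives:

```python
# Positive-unlabeled trick in the style of Elkan & Noto (2008): train a
# classifier on s (1 = labeled positive, 0 = unlabeled), estimate the
# labeling frequency c = P(s=1 | y=1) on held-out positives, and rescale.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def fit_pu(X, s):
    """X: numpy feature matrix; s: 1 for labeled-positive, 0 for unlabeled."""
    X_tr, X_hold, s_tr, s_hold = train_test_split(
        X, s, test_size=0.2, stratify=s, random_state=0
    )
    clf = LogisticRegression(max_iter=1000).fit(X_tr, s_tr)
    # c is the mean predicted score on the held-out *labeled* positives.
    c = clf.predict_proba(X_hold[s_hold == 1])[:, 1].mean()
    return clf, c

def predict_proba_pu(clf, c, X):
    # P(y=1 | x) = P(s=1 | x) / c, clipped to a valid probability.
    return np.clip(clf.predict_proba(X)[:, 1] / c, 0.0, 1.0)
```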

Extracting features from pretrained neural networks in Caffe using SkiCaffe

SkiCaffe is a wrapper that provides a “scikit-learn like” API to pretrained networks such as those distributed in the Caffe Model Zoo or elsewhere (such as DeepDetect). Basically, I wanted to use these pretrained models for extracting features, but also use the powerful pipelines of scikit-learn. Here we illustrate its basic use for extracting features. In scikit-learn parlance,…
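
The excerpt does not show SkiCaffe’s actual API, so the import path, class name, and constructor arguments below are hypothetical; the point is just how a “scikit-learn like” feature extractor would slot into a Pipeline:

```python
# Hypothetical usage sketch: the excerpt above doesn't show SkiCaffe's real
# API, so the import path and constructor arguments here are assumptions
# about what a "scikit-learn like" feature extractor could look like.
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from skicaffe import SkiCaffe  # hypothetical import path

features = SkiCaffe(
    model_prototxt="deploy.prototxt",         # assumed constructor arguments
    model_weights="bvlc_alexnet.caffemodel",
    layer_name="fc7",                         # which layer's activations to return
)

# Pretrained-network features feed a plain scikit-learn classifier.
pipeline = Pipeline([("features", features), ("clf", LogisticRegression())])
# pipeline.fit(image_paths, labels); pipeline.predict(new_image_paths)
```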

Visualizing Loss Functions for Neural Networks: where are all the local minima?

In engineering, finance, and many other computational sciences, we often need to make estimations or decisions based on collected data. In machine learning, for example, we want to build a neural network that “decides” what object is in an image. In real estate finance, investors want to estimate the price of a home based on previously sold…
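
One common way to peek at such a loss surface is to evaluate it along a random line (or plane) in weight space. A minimal sketch with a tiny, assumed one-hidden-layer network on toy data:

```python
# Minimal sketch of one common visualization: evaluate the loss of a tiny
# one-hidden-layer network along a random line in weight space. The
# architecture and data are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = (X[:, 0] * X[:, 1] > 0).astype(float)   # toy binary target

def loss(w):
    # Unpack a flat 20-parameter vector: 2x5 hidden layer, bias, output.
    W1, b1, w2 = w[:10].reshape(2, 5), w[10:15], w[15:20]
    h = np.tanh(X @ W1 + b1)
    p = 1.0 / (1.0 + np.exp(-(h @ w2)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

w0 = rng.normal(size=20)          # a base point in weight space
d = rng.normal(size=20)           # a random direction
d /= np.linalg.norm(d)

# Loss along the 1-D slice w0 + alpha * d; plotting alphas vs. values shows
# how bumpy (or not) the landscape looks along this direction.
alphas = np.linspace(-5, 5, 201)
values = [loss(w0 + a * d) for a in alphas]
```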

An intuitive explanation of the Xavier Initialization for Deep Neural Networks

The motivation for Xavier initialization in neural networks is to initialize the weights of the network so that the neuron activation functions do not start out in saturated or dead regions. In other words, we want to initialize the weights with random values that are not “too small” and not “too large.” Take a single…
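
Concretely, the Glorot & Bengio (2010) recipe scales the weight variance by the layer’s fan-in and fan-out; a minimal numpy sketch:

```python
# Minimal sketch of Xavier (Glorot) initialization: pick the weight
# variance from fan-in and fan-out so activations neither explode nor die.
import numpy as np

def xavier_uniform(fan_in, fan_out, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    # Glorot & Bengio (2010): Var(W) = 2 / (fan_in + fan_out), which for a
    # uniform distribution gives the limit sqrt(6 / (fan_in + fan_out)).
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

W = xavier_uniform(784, 256)                 # e.g. a 784 -> 256 layer
print(W.std(), np.sqrt(2.0 / (784 + 256)))   # empirical vs. target std
```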

My 2016 International Conference on Machine Learning Musings

This year’s ICML, one of the top machine learning conferences, took place in New York City, and I had the good fortune of attending. With tutorials, sessions, and workshops, the conference spanned six days. It was an intense week with many overlapping sessions, and obviously it was not possible for me to attend all of…