Aaron Beppu

Aaron Beppu's personal site. Not presently a blog.

View My GitHub Profile

Aaron Beppu

I’m an experienced engineer with interests in machine learning, large-scale data analysis, data engineering, real-time systems and related product development.

I’m seeking to bring these skills to applications which can benefit society.

Contact me

Experience

2013 - 2021, Sr to Principal Engineer, Sift (San Francisco, CA)

I was a key contributor within engineering during a period of continued and transformative growth, and I drove many of the projects underpinning that transformation. Here’s a sample:

Data hacking:

ML tooling and platform:

Product:

Platform/Infra:

Org Hacking:

Feb - July 2013, Software Engineer, Prismatic (San Francisco, CA)

Built and improved a range of backend services, including topic modeling, document life-cycle, and social media integrations.

Jan 2011 - Jan 2013, Software Engineer, Etsy (New York, NY)

Built a data-mining system to improve search ranking (e.g. see my Hadoop World 2011 presentation).

Big data tools and infrastructure:

Jun 2008 - Dec 2010, Software Development Engineer, A9.com (Palo Alto, CA)

Wrote and improved jobs for large-scale click-stream analysis, aggregating and reporting data about product search and search quality.

Education

2005 - 2008
BA, Cognitive Science; UC Berkeley, Departmental Citation

Public communications

While most of my work has not been directed at public release, here are some publicly visible artifacts:

Patent: US10339472B2
Technical blog: Models in Disguise: How Sift Science Ships Non-Disruptive Model Changes
Public speaking: Non-disruptive Model Changes
Academic publication: Beppu, Aaron & Griffiths, Thomas L. (2009). Iterated learning and the cultural ratchet. In N. A. Taatgen & H. van Rijn (eds.), Proceedings of the 31st Annual Conference of the Cognitive Science Society. pp. 2089–2094.

Technologies

Languages I have used in production: Java, Scala, Python, Clojure, JavaScript, PHP

Libraries/frameworks/services: Hadoop, Spark, Kafka, HBase, Elasticsearch, Solr, MySQL, Postgres, Memcached, Airflow, Oozie, Protobuf, Avro, Thrift, AWS services (many), GCP services (several)