James Long
[address withheld]
San Jose, CA 95148
Home : [email for more information]
Cell : [email for more information]
Email: jamlong [at] gmail [dot] com
-----------------------------------------------------------------------------
OBJECTIVE
I am currently open to hearing about external opportunities. My interests are
largely related to abuse problems (Spam, Policy Abuse, Payment Fraud).
Specifically, I tend to work on Machine Learning problems where there is so
little data that semi-supervised or unsupervised approaches tend to be
necessary.
More generally: I enjoy transforming data into money through machine
learning.
-----------------------------------------------------------------------------
EMPLOYMENT HISTORY
January 2017 - Present
Staff Software Engineer - Google - Common Abuse Tools
Working on Applied Research in the area of Abuse Prevention.
Similarity -
Working on an end-to-end pipeline for general similarity, utilizing
Triplet Loss and Quadruplet Loss. Simplifying the process so the user can
simply provide their dataset and an embedding model for a single instance,
and the pipeline creates a full Triplet Loss model (including optional
auxiliary tasks such as Image Rotation, Autoencoders, etc.).
KerasTuner -
https://elie.net/talk/cutting-edge-tensorflow-keras-tuner-hypertuning-for-humans/
Early adopter. Assisted greatly in the refactoring and production readiness
of KerasTuner.
https://github.com/keras-team/keras-tuner
SCAAML -
Automated the running of tens of thousands of experiments on Side Channel
Attacks against crypto keys.
Re-wrote and open-sourced parts of the SCAAML codebase:
https://github.com/google/scaaml
Created an End-to-end Attack Colab/Jupyter notebook:
https://colab.research.google.com/drive/1VmqDxgvUltgDAvyc1RKGUdJS9Cn6U4GV
June 2016 - December 2016
Staff Software Engineer - Google - YouTube Enforcement
Planning the future of YouTube Enforcement across Engagement Abuse, Video
Abuse, and Content Spam, eliminating redundancy and improving our overall
Abuse efforts.
Creating signals and models to detect Spam in YouTube.
Strategy / C++ / Distributed Systems / Decision Trees / AdaBoost / SVMs
Data Analysis / SQL
January 2014 - June 2016
Staff Software Engineer - Google Research
Freud - Machine Learning Platform - Latent Dirichlet Allocation
Productionized a distributed implementation of Latent Dirichlet Allocation,
an algorithm for generating Topic Models, and ran models at massive scale,
stretching our capabilities (x,xxx machines, xx,xxx CPUs, xxx TB RAM).
Paper published in OSDI '14 - http://goo.gl/6wUmZY
Identified potential users, pitched how we could help them, and consulted with
interested teams across Google (YouTube, Search, News, Gmail, Health, and
others).
Generated LDA models based on user search data, and launched changes in Search
Ads, netting a whopping total of $0.X Billion / year in incremental revenue.
C++ / SOA / Distributed Systems / Flume / Machine Learning / LDA / Research
November 2011 - January 2014
Staff Software Engineer - Google - Payment Fraud
Tech Lead for Risk Infrastructure (7 developers), owning the infrastructure
used to detect fraud and abuse in all paid Google products (AdWords, Checkout,
Pay By Gmail, Enterprise, etc.)
Led a major project (~2 years, 3 engineers) to build a custom database and
query engine tailored to our team's needs. Final system handles
350,000 read QPS and 100,000 update QPS, while reducing latency of our signal
collection by 98%, and turning down 3 legacy systems used for similar signals.
Massive productivity gains for ~30 engineers working on signals.
Owned the first models for detecting policy abuse in AdWords (scams,
counterfeit goods, etc.), leading to a 10x increase in the number of accounts
(correctly!) shut down for policy violations over what human reviewers were
able to do.
Created the first models for detecting Account Take Over (ATO) through
phishing / malware, reducing fraud losses by ~$24M/year.
Created a system for automatically detecting duplicate/related customers,
allowing a single detected abuser to potentially cascade into a large number
of account suspensions.
Java / Bigtable / MapReduce / Flume / Distributed Systems / SOA
Machine Learning / SVMs / Decision Trees and Forests / Signals
Technical Leadership (~7 engineers)
February 2007 - November 2008
Site Reliability / Software Engineer III - Google - Apps for Your Domain
Tech Lead for Apps for Your Domain (now G Suite) - Site Reliability.
Troubleshot, improved, and handled on-call for teams like Gmail, Apps
for Your Domain, and Google's authentication service.
Troubleshooting / Load Balancing / Site Reliability
Technical Leadership (~6 engineers)
July 2002 - January 2007
Software Developer II - Amazon.com - Personalization
Automated our flagship marketing e-mail campaign, reducing the number
of humans needed by 10+, while expanding the number of products we
could promote. Fully automated e-mail campaigns generated ~$1M in yearly
revenue.
Maintained production environment, including specialized hardware for sending
transactional and merchandising e-mail.
Led several small-team (2-3 engineer) projects.
C++ / Java / SOA / Perl / Oracle / MySQL / Recommendations / SMTP
Team Leadership (~3 engineers)
August 2001 - May 2002
Freelance Developer - NetMath at the University of Illinois - Urbana-Champaign
Created a virtual classroom / homework hand-in system, which was used for
more than a decade.
PHP / MySQL / HTML
Summer 2001
Software Development Intern - Microsoft
Media Player
Created a JNI-based plugin for Windows Media Player (to allow embedding
in Netscape without licensing issues).
Java / JNI / C++ / JavaScript
Summer 2000
Software Development Intern - Microsoft
Learning Technologies Group
Created a prototype of an online classroom, with streaming video/audio,
user-driven questions, polls, notes, and more.
HTML / JavaScript / COM
----------------------------------------------------------------------------
EDUCATION
1998 - 2002 University of Illinois at Urbana-Champaign
Bachelor of Science - Computer Science
James Scholar (http://jamesscholar.cen.uiuc.edu/)
--------------------------------------------------------------------------
REFERENCES
Available upon request.