James Long
[address withheld]
San Jose, CA 95148
Home : [email for more information]
Cell : [email for more information]
Email: jamlong [at] gmail [dot] com

-----------------------------------------------------------------------------
OBJECTIVE

I am currently open to hearing about external opportunities. My interests are largely related to abuse problems (Spam, Policy Abuse, Payment Fraud). Specifically, I tend to work on Machine Learning problems where there is so little data that semi-supervised or unsupervised approaches tend to be necessary.

More generally: I enjoy transforming data into money through machine learning.

-----------------------------------------------------------------------------
EMPLOYMENT HISTORY

January 2017 - Present
Staff Software Engineer - Google - Common Abuse Tools

Working on Applied Research in the area of Abuse Prevention.

Similarity - Working on an end-to-end pipeline for general similarity, utilizing Triplet Loss and Quadruplet Loss. Simplifying the process so that the user only has to provide a dataset and an embedding model for a single instance, and the pipeline creates a full Triplet Loss model (including optional auxiliary tasks such as Image Rotation, Autoencoders, etc.).

KerasTuner - https://elie.net/talk/cutting-edge-tensorflow-keras-tuner-hypertuning-for-humans/
Early adopter. Assisted greatly in the refactoring and production readiness of KerasTuner.
https://github.com/keras-team/keras-tuner

SCAAML - Automated the running of tens of thousands of experiments on Side Channel Attacks against crypto keys.
Re-wrote and open-sourced parts of the SCAAML codebase: https://github.com/google/scaaml
Created an End-to-end Attack Colab/Jupyter notebook: https://colab.research.google.com/drive/1VmqDxgvUltgDAvyc1RKGUdJS9Cn6U4GV

June 2016 - December 2016
Staff Software Engineer - Google - YouTube Enforcement

Planning the future of YouTube Enforcement across Engagement Abuse, Video Abuse, and Content Spam, eliminating redundancy and improving our overall Abuse efforts.

Creating signals and models to detect Spam in YouTube.

Strategy / C++ / Distributed Systems / Decision Trees / AdaBoost / SVMs
Data Analysis / SQL

January 2014 - June 2016
Staff Software Engineer - Google Research

Freud - Machine Learning Platform - Latent Dirichlet Allocation

Productionized a distributed implementation of Latent Dirichlet Allocation, an algorithm for generating Topic Models, and ran models at massive scale, stretching our capabilities (x,xxx machines, xx,xxx CPUs, xxx TB RAM).

Paper published in OSDI '14 - http://goo.gl/6wUmZY

Identified potential users, pitched how we could help them, and consulted with interested teams across Google (YouTube, Search, News, Gmail, Health, and others).

Generated LDA models based on user search data, and launched changes in Search Ads, netting a total of $0.X Billion / year in incremental revenue.

C++ / SOA / Distributed Systems / Flume / Machine Learning / LDA / Research

November 2011 - January 2014
Staff Software Engineer - Google - Payment Fraud

Tech Lead for Risk Infrastructure (7 developers), owning the infrastructure used to detect fraud and abuse in all paid Google products (AdWords, Checkout, Pay By Gmail, Enterprise, etc.).

Led a major project (~2 years, 3 engineers) to build a custom database and query engine tailored to our team's needs. The final system handles 350,000 read QPS and 100,000 update QPS, while reducing the latency of our signal collection by 98% and allowing us to turn down 3 legacy systems used for similar signals.
Massive productivity gains for ~30 engineers working on signals.

Owned the first models for detecting policy abuse in AdWords (scams, counterfeit goods, etc.), leading to a 10x increase in the number of accounts (correctly!) shut down for policy violations over what human reviewers were able to do.

Created the first models for detecting Account Take Over (ATO) through phishing / malware, reducing fraud losses by ~$24M/year.

Created a system for automatically detecting duplicate/related customers, allowing a single detected abuser to potentially cascade into a large number of account suspensions.

Java / Bigtable / MapReduce / Flume / Distributed Systems / SOA
Machine Learning / SVMs / Decision Trees and Forests / Signals
Technical Leadership (~7 engineers)

February 2007 - November 2008
Site Reliability / Software Engineer III - Google - Apps for Your Domain

Tech Lead for Apps for Your Domain (now GSuite) - Site Reliability.

Troubleshot, improved, and handled on-call for teams like Gmail, Apps for Your Domain, and Google's authentication service.

Troubleshooting / Load Balancing / Site Reliability
Technical Leadership (~6 engineers)

July 2002 - January 2007
Software Developer II - Amazon.com - Personalization

Automated our flagship marketing e-mail campaign, reducing the number of humans needed by 10+, while expanding the number of products we could promote.

Fully automated e-mail campaigns generated ~$1M in yearly revenue.

Maintained production environment, including specialized hardware for sending transactional and merchandising e-mail.

Led several small-team (2-3 engineer) projects.

C++ / Java / SOA / Perl / Oracle / MySQL / Recommendations / SMTP
Team Leadership (~3 engineers)

August 2001 - May 2002
Freelance Developer - NetMath at the University of Illinois - Urbana-Champaign

Created a virtual classroom / homework hand-in system, which was used for more than a decade.
PHP / MySQL / HTML

Summer 2001
Software Development Intern - Microsoft Media Player

Created a JNI-based plugin for Windows Media Player (to allow embedding in Netscape without licensing issues).

Java / JNI / C++ / JavaScript

Summer 2000
Software Development Intern - Microsoft Learning Technologies Group

Created a prototype of an online classroom, with streaming video/audio, user-driven questions, polls, notes, and more.

HTML / JavaScript / COM

-----------------------------------------------------------------------------
EDUCATION

1998 - 2002
University of Illinois at Urbana-Champaign
Bachelor of Science - Computer Science
James Scholar (http://jamesscholar.cen.uiuc.edu/)

-----------------------------------------------------------------------------
REFERENCES

Available upon request.