Last updated September 25, 2017.

My current research interests are transfer learning and deep reinforcement learning, with a bias towards their applications to robotics.

This page is divided into post-undergrad projects and undergrad projects. The undergrad projects are significantly lower quality, but I leave them up because I’m still slightly proud of what I accomplished, given what I knew at the time.


Post-Undergrad Work

Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping

Konstantinos Bousmalis*, Alex Irpan*, Paul Wohlhart*, Yunfei Bai, Matthew Kelcey, Mrinal Kalakrishnan, Laura Downs, Julian Ibarz, Peter Pastor, Kurt Konolige, Sergey Levine, Vincent Vanhoucke. Asterisk indicates equal contribution.

Paper and Video: here

Combines several domain adaptation techniques (both feature-level and pixel-level) to learn real-world grasping policies from monocular RGB images.

Learning Hierarchical Information Flow with Recurrent Neural Modules

Danijar Hafner, Alex Irpan, James Davidson, Nicolas Heess

Paper: here

Proposes ThalNet, a network architecture inspired by the thalamus in the brain. Accepted to NIPS 2017.


Undergrad Work

Exploring Boosted Neural Nets for Rubik’s Cube Solving

Alex Irpan

Spring 2016. Final project for CS 281B, Advanced Topics in Decision Making

Poster (click for full size):

Poster

Paper: here

Code: GitHub

Applies AdaBoost to an online data stream to aid in training a neural net to classify the next move in a Rubik’s Cube solution.

Factored Q-Learning in Continuous State-Action Spaces

Alex Irpan, mentored by John Schulman

Fall 2015. Final project for CS 281A, Statistical Learning Theory

Poster (click for full size):

281A Poster

Informal Writeup: PDF

Code: BitBucket

Represents a Q-function as the sum of one Q-function for each dimension of the action space, aiming to avoid the curse of dimensionality in high-dimensional discretized MDPs.
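As a rough illustration of the factoring idea (not the project's actual code), the sketch below assumes a tabular setting where each action dimension is discretized into a few bins. The table layout, the sizes, and the names `factored_q` and `greedy_action` are all hypothetical. It shows why the sum helps: the greedy action can be found by maximizing each dimension on its own, instead of searching every joint action.

```python
import itertools
import random

def factored_q(q_tables, state, action):
    """Q(s, a) = sum_i Q_i(s, a_i), with one table per action dimension."""
    return sum(q_tables[i][state][a_i] for i, a_i in enumerate(action))

def greedy_action(q_tables, state):
    """Because the sum separates across dimensions, maximize each one independently."""
    return tuple(
        max(range(len(q[state])), key=lambda b: q[state][b])
        for q in q_tables
    )

# Hypothetical sizes, chosen only for illustration.
random.seed(0)
n_states, n_dims, n_bins = 5, 3, 4

# q_tables[i][s][b]: value contribution of bin b in action dimension i at state s.
q_tables = [
    [[random.gauss(0.0, 1.0) for _ in range(n_bins)] for _ in range(n_states)]
    for _ in range(n_dims)
]

state = 2
best = greedy_action(q_tables, state)  # n_dims * n_bins comparisons

# Brute force over all n_bins ** n_dims joint actions, as a sanity check.
brute = max(
    itertools.product(range(n_bins), repeat=n_dims),
    key=lambda a: factored_q(q_tables, state, a),
)
assert best == brute
```

Under these assumptions, the per-dimension search costs `n_dims * n_bins` lookups, while the brute-force search grows as `n_bins ** n_dims`. The trade-off is that a sum of per-dimension terms cannot represent interactions between action dimensions.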

An Overview of Sublinear Machine Learning Algorithms

Alex Irpan*, Ronald Kwan* (worked equally)

Spring 2015. Final project for CS 270, Combinatorial Algorithms and Data Structures

Report: PDF

A survey paper summarizing algorithms used to solve SDPs and learn SVMs in sublinear time.

Integrating Monte Carlo Tree Search with Reinforcement Learning

Alex Irpan, mentored by John Schulman

Fall 2014 - Spring 2015

Code: BitBucket

Regresses a policy to target values generated by MCTS, with the policy being trained as the rollout policy. Note: done before AlphaGo was announced.