Posts

  • Four Years Later

    Sorta Insightful turns four years old today! Whether you were here from the beginning, or just discovered this blog, thanks for reading.

    I like to do an annual meta-post about blogging. This year’s post is a bit rushed, since I’ve been doing MUMS Puzzle Hunt for the past two weeks, but I knew the risks going in.

    Statistics

    Word Count

    Last year, I wrote 24,449 words. This year, I wrote 21,878 words.

     1,530 2018-08-18-three-years.markdown  
     3,602 2018-11-10-neopets-economy.markdown  
     2,019 2018-11-27-go-explore.markdown  
     2,773 2019-01-26-mh-2019.markdown  
     3,136 2019-02-22-alphastar.markdown  
     3,087 2019-02-22-alphastar-part2.markdown  
     1,075 2019-04-13-openai-finals.markdown  
     2,621 2019-05-26-iclr19.markdown  
       343 2019-06-04-arxiv-rl.md  
       696 2019-07-11-still-here.markdown  
       996 2019-07-27-lagrange-multipliers.markdown  
    21,878 total  

    In all honesty, I’m surprised it isn’t lower. I haven’t spent as much time blogging this year.

    I wrote 10 posts this year, just under my trend of one post a month on average. This counts the AlphaStar post as a single post, even though it was split into two parts, because both parts were written and released at the same time.

    View Counts

    These are the view counts from August 18, 2018 to today, for the posts I’ve written this year.

        524 2018-08-18-three-years.markdown  
      2,459 2018-11-10-neopets-economy.markdown  
      4,651 2018-11-27-go-explore.markdown  
        716 2019-01-26-mh-2019.markdown  
      4,901 2019-02-22-alphastar.markdown  
      2,948 2019-02-22-alphastar-part2.markdown  
        731 2019-04-13-openai-finals.markdown  
        853 2019-05-26-iclr19.markdown  
        315 2019-06-04-arxiv-rl.md  
        247 2019-07-11-still-here.markdown  
      1,055 2019-07-27-lagrange-multipliers.markdown  

    Excluding outliers, these numbers are higher than last year’s. I’m pleasantly surprised my Neopets economy post got solid viewership. It’s one of my favorites. On a reread, I’d change a bunch of things about the writing, but that’s what normally happens when you read your old work.

    Time Spent Writing

    Excluding time spent on this post, my time tracker says I’ve spent 88 hours, 12 minutes writing for my blog this year. This is about 2/3rds of the time I spent last year, and explains why it felt like I didn’t blog as much this year: I genuinely didn’t!

    The word count doesn’t reflect this, but that’s expected. The rather frustrating thing about writing is how little correlation there is between time spent and words produced. If anything, you spend a lot of time making sure you use fewer words. The better you do this, the fewer people notice it.

    If I am to speak ten minutes, I need a week for preparation; if fifteen minutes, three days; if half an hour, two days; if an hour, I am ready now.

    (Woodrow Wilson. Well, maybe...)

    Posts in Limbo

    To be honest, I do still want to write that post about Gunnerkrigg Court. I’ve been talking about writing that post for over two years. One day, it’ll happen. It has to.

    (Me, last year)

    Oh, past me, how wrong you were.

    My interest in Gunnerkrigg Court has cooled down a lot ever since the conclusion of the Jeanne arc. Don’t get me wrong, it’s still a great webcomic. However, it’s been less emotionally resonant for me this year. There have been a few big events, but right now it feels like the comic is in search of a new overarching mystery to guide the plot. The chapters so far have felt like aftershocks of the Jeanne conclusion, rather than a new plot. I’m sure it’ll get to a new driving question soon, but with the story unfolding at Webcomic Time, it’s going to take a while to get there.

    I still want to write a Gunnerkrigg Court post. But I want to write it in the sense that I like the idea of writing it. That isn’t the same as actually doing it.

    One way to view decision making is that your choices depend on two things: your mental state, and your environment. Side projects, like this blog, usually don’t depend on the environment, because they come with few deadlines and little accountability.

    I, at some point, reach a mental state where I think “Gee, I should write a post about Gunnerkrigg Court”. But I’m busy, or want to write other posts, so I don’t write about Gunnerkrigg Court. This decision sends me on a trajectory that arrives at the same mental state where I think “Gee, I should write a post about Gunnerkrigg Court”.

    Because there’s no accountability, unless something has changed in my thinking since last time, my next decision will be the same: I’ll say I’m busy and do something else. Since my choice was the same, it sends me on the same trajectory as before, that arrives at the same point as before, creating a loop. Until something changes, my behavior will follow the same cyclic pattern where nothing happens.

    For decisions that depend on the environment, you can break these loops by changing the environment. Deadlines are a good example of this. I know that I want to make a post for my blogging anniversary. This deadline is part of the environment. It arrives on August 18th, it’s inevitable, and there’s no way to change it. That deadline stops me from putting it off forever, because it changes the state. First I have a week to write the post. Then, I have six days. Five days, four days - eventually, the time limit becomes dire enough that I can’t avoid it. If there isn’t a deadline, or if I don’t respect the self-imposed deadline, then this doesn’t work. It’s easy to get stuck when the state doesn’t change.

    The hard way to break these loops and not get stuck is to simply resolve not to get stuck and Actually Do Things. This means recognizing that if you don’t work on something now, you may never work on it in the future, because you’re implicitly deciding what you’ll do in any future decision point sufficiently similar to the current one. Maybe you’re okay with that. Maybe you aren’t. If you aren’t, then you should just Do The Thing, or at least Start Doing The Thing. Of course, it’s hard to Actually Do Things, but it’s why I respect people who do so.

    (A lot of this section is inspired by conversations I’ve had with friends that self-identify as rationalists, who tried explaining timeless decision theory to me. I didn’t get timeless decision theory at all, but it turns out that you don’t have to understand something to get useful ideas out of it.)

    What’s Next?

    I’m planning to make more of an effort on blogging this year. In case I don’t write any of the planned posts, here are some notes on posts I’ve been meaning to write. That way, if they never materialize, you can fill in the blanks yourself.

    Measurement

    Topic: A post about measurement. Not at the level of “how is a meter defined”, although I do like that discussion. It would be more about how measurements are inherently imperfect pictures of the reality we’re trying to measure, where imperfections are created by the process generating the measurements. Understanding this is important for explaining why statistics and machine learning can give misleading results, as well as the limitations and risks of ML.

    Why I want to Write It: It’s a useful mental concept that explains why more data isn’t always good. It’s something that experienced ML people understand, but it takes a while for the lessons to sink in. As machine learning gets applied to more of society, it’s crucially important that non-ML people become ML-literate. Not at the level of understanding the math behind machine learning, but at the level of understanding the concerns of blindly applying machine learning to arbitrary problem domains.

    Estimated Time to Complete: Very high. I’ve thought about this a while, and could write a bad post pretty quickly, but I want to explain it right, and that’ll require a lot of careful writing.

    Odds I Write This, Eventually: 95%

    Odds I Write This by the End of 2019: 20%

    Gunnerkrigg Court

    Topic: A post about Gunnerkrigg Court. What it does well, why I find its story fun, and why you should read it.

    Why I Want to Write It: Gunnerkrigg Court is one of my favorite webcomics of all time, and although I have a short blurb about it on my Webcomics Recommendation page, I’ve felt I have more to say about it than just that blurb suggests.

    Estimated Time to Complete: Medium-High. I’m not sure exactly what I want to say, but I don’t think there’s a lot of it.

    Odds I Write This, Eventually: 70%

    Odds I Write This by the End of 2019: 25%

    My Little Pony

    Topic: A post about attending BronyCon 2019, and my thoughts on what My Little Pony means to me and the community that grew around it, along with what I’ve learned about how communities are created and become self-sustaining.

    Why I Want to Write It: I’ve been meaning to write a post about My Little Pony for a while, but had trouble deciding where to begin. It was too big for me to explain. I think I can use BronyCon 2019 as a lens to focus thoughts into more specific ideas, and if I have unused material I can make follow-on posts later.

    Estimated Time to Complete: High. Will likely involve lots of introspecting, and figuring out the right phrasings for implicit concepts that I haven’t put into words before.

    Odds I Write This, Eventually: 85%

    Odds I Write This by the End of 2019: 50%

    Dominion Online

    Topic: A post about the history of Dominion Online, and the many trials it’s been through. This is the same post I mentioned in a previous post.

    Why I Want to Write It: I’ve been part of the competitive Dominion community for many years. The list of people who have both lived through all the drama and want to document it is very, very short. It’s pretty much just me, as far as I know, so if I don’t write it, no one will.

    Estimated Time to Complete: High. I know exactly what I want to say, but there are a lot of sources to look up, and the sheer length of what I want to write is daunting. The only reason this isn’t “Very high” is because I have the first 1/3rd written already.

    Odds I Write This, Eventually: 85%

    Odds I Write This by the End of 2019: 35%

  • A Lagrange Multipliers Refresher, For Idiots Like Me

    What Are Lagrange Multipliers?

    Lagrange multipliers are a tool for doing constrained optimization. Say we are trying to minimize a function \(f(x)\), subject to the constraint \(g(x) = c\).

    \[\begin{aligned} \min &\,\, f(x) \\ \text{subject to} &\,\, g(x) = c \end{aligned}\]

    To solve this, you define a new function.

    \[\mathcal{L}(x, \lambda) = f(x) - \lambda (g(x) -c)\]

    The optimum lies at a stationary point of \(\mathcal{L}\) (a point where the gradients with respect to \(x\) and \(\lambda\) are both zero).

    This is all true, but it doesn’t explain why it works.
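    To make the recipe concrete, here’s a toy example of my own (not from any particular source): minimize \(x^2 + y^2\) subject to \(x + y = 1\). The stationarity conditions can be solved by hand and checked directly:

```python
# Toy problem: minimize f(x, y) = x^2 + y^2 subject to g(x, y) = x + y = 1.
#
# Lagrangian: L(x, y, lam) = x^2 + y^2 - lam * (x + y - 1)
# Stationarity conditions:
#   dL/dx   = 2x - lam     = 0
#   dL/dy   = 2y - lam     = 0
#   dL/dlam = -(x + y - 1) = 0
# Solving by hand: x = y = lam / 2, and x + y = 1 forces lam = 1, x = y = 1/2.

x, y, lam = 0.5, 0.5, 1.0

# Verify all three partial derivatives vanish at the claimed point.
assert 2 * x - lam == 0
assert 2 * y - lam == 0
assert x + y - 1 == 0
print(x, y, lam)  # 0.5 0.5 1.0
```

    Geometrically, stationarity in \(x\) says \(\nabla f = \lambda \nabla g\): at the constrained optimum, the gradient of the objective is parallel to the gradient of the constraint.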

    Why Do Lagrange Multipliers Work?

    Let’s consider a variant of the problem. We want to minimize \(f(x)\) subject to \(g(x) \ge 0\).

    \[\begin{aligned} \min &\,\, f(x) \\ \text{subject to} &\,\, g(x) \ge 0 \end{aligned}\]

    Let’s define the following min-max optimization problem.

    \[\min_x \max_{\lambda \ge 0} f(x) - \lambda g(x)\]

    I claim that the solution \(x\) of this min-max problem is the \(x\) with the smallest \(f(x)\) among all \(x\) satisfying the constraint \(g(x) \ge 0\). Why?

    As written, we first choose an \(x\), then choose a \(\lambda\) that maximizes the objective, and we want to choose an \(x\) that minimizes how much an adversarial \(\lambda\) can hurt us. Suppose we are violating the constraint \(g(x) \ge 0\). Then we have \(g(x) < 0\). At such an \(x\), \(-g(x) > 0\), so we can send \(\lambda \to \infty\) to drive the objective value to \(\infty\).

    If we are not violating the constraint, then \(-g(x)\) is \(0\) or negative, and since \(\lambda\) is constrained to \(\lambda \ge 0\), the optimal \(\lambda\) is \(\lambda = 0\), giving objective value \(f(x)\).

    In other words, the solution of this min-max problem is the same as the solution to the original constrained problem.
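    You can sanity-check this argument numerically. The sketch below (a toy example of my own) approximates the inner maximization with a finite grid of \(\lambda\) values: when \(x\) is feasible, the best \(\lambda\) is \(0\) and the value is just \(f(x)\); when \(x\) is infeasible, the value blows up with \(\lambda\).

```python
# Toy problem: f(x) = (x - 3)^2, constraint g(x) = x >= 0.
def f(x):
    return (x - 3.0) ** 2

def g(x):
    return x

def inner_max(x, lam_grid):
    # Approximate max over lambda >= 0 of f(x) - lambda * g(x) on a finite grid.
    return max(f(x) - lam * g(x) for lam in lam_grid)

lams = [0.0, 1.0, 10.0, 1e6]
print(inner_max(2.0, lams))   # 1.0: feasible, best lambda is 0, so we get f(2)
print(inner_max(-1.0, lams))  # 1000016.0: infeasible, large lambda blows the value up
```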

    This handles the \(g(x) \ge 0\) case. What if we have multiple constraints?

    \[\begin{aligned} \min &\,\, f(x) \\ \text{subject to} &\,\, g_1(x) \ge 0 \\ &\,\, g_2(x) \ge 0 \end{aligned}\]

    We can define a similar min-max problem by adding a Lagrange multiplier \(\lambda_i\) for each constraint \(i\).

    \[\min_x \max_{\lambda_1,\lambda_2 \ge 0} f(x) - \lambda_1 g_1(x) - \lambda_2 g_2(x)\]

    By a similar argument, the optimal solution to this min-max problem occurs at \(\lambda_1 = 0, \lambda_2 = 0\), and \(x\) at the solution of the original constrained optimization problem, assuming it exists. If either constraint was violated, then we could have driven the corresponding \(\lambda_i\) to \(\infty\), like before.

    What if we have a constraint \(g(x) \ge c\)? We can rearrange it to the constraint \(g(x) - c \ge 0\).

    \[\begin{aligned} \min &\,\, f(x) \\ \text{subject to} &\,\, g(x) - c \ge 0 \end{aligned}\]

    This is solved the same way.

    \[\min_x \max_{\lambda \ge 0} f(x) - \lambda (g(x) - c)\]

    If we have a constraint \(g(x) \le 0\), we can negate both sides to get \(-g(x) \ge 0\). If we have a constraint \(g(x) \le c\), we can rearrange it to \(c - g(x) \ge 0\). This reduces everything to the \(g(x) \ge 0\) case we know how to solve.

    Bringing it back to the equality case, let’s say we have a constraint \(g(x) = c\). How do we reduce this to the previous cases? This equality is the same as the pair of constraints \(g(x) \ge c, g(x) \le c\), which are both satisfied only at \(g(x) = c\). This time, I’ll write the Lagrange multipliers as \(\lambda_+\) and \(\lambda_-\).

    \[\begin{aligned} \min &\,\, f(x) \\ \text{subject to} &\,\, g(x) \ge c \\ \text{subject to} &\,\, g(x) \le c \end{aligned}\] \[\min_x \max_{\lambda_+, \lambda_- \ge 0} f(x) - \lambda_+ (g(x) - c) - \lambda_- (c - g(x))\]

    Like before, if we ever have \(g(x) \neq c\), then we can choose \(\lambda_+, \lambda_-\) such that the objective value shoots up to \(\infty\). But, since \(g(x) - c\) and \(c - g(x)\) are just negations of each other, we can simplify this further.

    \[\min_x \max_{\lambda_+, \lambda_- \ge 0} f(x) - (\lambda_+ - \lambda_-) (g(x) - c)\]

    It’s possible to make \(\lambda_+ - \lambda_-\) equal any real number while still satisfying \(\lambda_+ \ge 0, \lambda_- \ge 0\), so let’s just replace \(\lambda_+ - \lambda_-\) with \(\lambda\), and say \(\lambda\) can be any real number instead of only nonnegative ones.

    \[\min_x \max_{\lambda} f(x) - \lambda (g(x) - c)\]

    Now, we’re back to the equality case we started with. The solution to this must lie at a saddle point where the gradients with respect to \(x\) and \(\lambda\) are both zero.

    Why Do I Like This Explanation?

    When I was re-learning Lagrange multipliers a while back, I was upset that all the most popular search results were targeted toward people taking their first multivariate calculus course. These explanations exclusively covered the \(g(x) = c\) case, and the constraints I wanted to add were more like \(a \le g(x) \le b\).

    It’s a shame that most people’s first introduction to Lagrange multipliers only covers the equality case, because inequality constraints are more general, the concepts needed to understand that case aren’t much harder, and it’s clearer how you’d apply an optimization algorithm to solve the resulting unconstrained problem. As in other min-max optimization (GANs, actor-critic RL, etc.), doing alternating gradient descent on \(x\) and gradient ascent on \(\lambda\) is both simple and works out fine.

    This procedure doesn’t guarantee that your constraint \(g(x) \ge c\) is met over the entire optimization process, but it does quickly penalize the optimization for violating the constraint, since the \(\lambda\) for that constraint will quickly rise and keep rising until the constraint is satisfied again, at which point \(\lambda\) will quickly regress towards \(0\) until it is needed once more.

    As one final detail, handling inequalities requires optimizing over \(\lambda \ge 0\). The common trick is to reparameterize \(\lambda\) as \(\lambda = \exp(w)\) or \(\lambda = \text{softplus}(w)\), then do the gradient updates on the unconstrained variable \(w\) instead.
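    As a minimal sketch of that whole recipe, here’s a toy problem of my own (not from any reference): minimize \(x^2\) subject to \(x \ge 1\), whose solution is \(x = 1\) with multiplier \(\lambda = 2\). It alternates gradient descent on \(x\) with gradient ascent on \(w\), where \(\lambda = \text{softplus}(w)\):

```python
import math

# Toy problem: minimize f(x) = x^2 subject to g(x) = x >= 1.
# Lagrangian with reparameterized multiplier:
#   L(x, w) = x^2 - softplus(w) * (x - 1),   lambda = softplus(w) >= 0.

def softplus(w):
    return math.log1p(math.exp(w))

def sigmoid(w):  # derivative of softplus
    return 1.0 / (1.0 + math.exp(-w))

x, w, lr = 0.0, 0.0, 0.01
for _ in range(20000):
    lam = softplus(w)
    x -= lr * (2 * x - lam)            # descent on x:  dL/dx = 2x - lambda
    w += lr * (-(x - 1) * sigmoid(w))  # ascent on w:   dL/dw = -(x - 1) * sigmoid(w)

print(round(x, 3), round(softplus(w), 3))  # converges to ~1.0 and ~2.0
```

    Note the multiplier only has force while the constraint is violated: once \(x \ge 1\), the gradient on \(w\) is nonpositive and \(\lambda\) drifts back down, matching the penalty behavior described above.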

  • I'm Still Here

    I normally like my blog posts to have a coherent theme to them. The problem is that this takes time, which I haven’t had because I’ve been putting it into other side projects instead. I got inspired recently, and figured, screw it, I’ll just write a grab bag blog post to remind people this blog still exists, and go from there.

    The first side project I’ve been working on is a history of Dominion Online. Dominion is a card game I was introduced to in 2010, and I took a liking to it right away. I got deep enough into it that I started playing competitively, participating in online tournaments, and so on. These days I don’t play it very often, but I still stick around the community. The online version of Dominion has a rich, storied history. Many of the older members who’ve seen it all have left Dominion, and I’m realizing I’m one of the few people who both experienced all of it firsthand and wants to make sure those stories aren’t forgotten. Hence, the side project. Writing it has been really fun, but also really time-consuming. One restriction I’ve set is that everything needs a primary source, which means searching through 8-year-old forum threads and archive.org snapshots of defunct companies’ sites to get links for everything. I have no idea who will read this, but it’s important to me.

    My second side project is running a puzzlehunt themed around My Little Pony. I’ve thought about doing this for a while, and when Hasbro announced the final season of My Little Pony: Friendship is Magic would air this year, I realized it was do or die. I don’t know when the puzzlehunt will be finished, but the hunt has passed the point where it’ll for-sure exist in some form. I’ve gotten a few people roped into constructing and testsolving, the rough puzzle and story structure is in place, and metapuzzles are getting testsolved as I write this. Funnily enough, I’m the only brony out of all the puzzle constructors and testsolvers so far, and I’m still surprised everyone involved is putting up with my nonsense. If you’re interested in testsolving or constructing, email me, and if you’d like to solve it when it’s done, stay tuned.

    In AI for Games news, Blizzard announced that AlphaStar will be playing on the live SC2 competitive ladder, to evaluate agent performance in blind trials. AlphaStar can now play as all three races, has more restrictive APM limits than their previous version, and has been trained to only observe information from units within the current camera view. Should be fun - I expect it to beat everyone in blind games by this point.

    And today, Facebook and CMU announced they’ve developed Pluribus, a poker bot that can beat pros at 6-player no-limit Texas Hold’em. The Science paper is available here, and the Facebook blog post linked above is both pretty informative and free of all the media hype that will surely appear around this result. I say Facebook and CMU, but it’d be more accurate to credit the authors: Noam Brown and Tuomas Sandholm, the two authors of Libratus, and now Pluribus. Congrats!

    I have two predictions about the reception. The first prediction is that there will be a ridiculously wide misconception that Pluribus is learning to read pro poker players and adapt its strategy to beat them. This isn’t what’s happening - it’s more like Pluribus is playing against imagined versions of itself that are more prone to folding or calling or raising, and using this to learn poker strategies that are less exploitable. The second prediction is that the news media is going to milk the “too dangerous to release the code” angle for all it’s worth. I likely wouldn’t release the code either, but last I heard, making money in online poker has been dicey ever since online poker’s Black Friday. The biggest consequence will likely be that scrubs will now believe all their opponents are using superhuman poker bots, to avoid facing the reality that they suck at poker.
