• The Berkeley TA Back Pay Settlement, Summarized

    I was last at Berkeley in Spring 2016, and it’s possible things have changed since then, but I’m aiming to represent the viewpoints as accurately as possible.

    On Tuesday, UAW Local 2865, the union representing TAs across the UC system, announced that UC Berkeley would pay $5 million in back pay to TAs. The story is getting picked up by a few places: The Chronicle of Higher Education, Inside Higher Ed, local Bay Area news outlets like The Mercury News and Santa Cruz Sentinel, and even some national outlets like Vice.

    In aggregate, these articles actually do a pretty good job of explaining the details, and a few of the different viewpoints, but as someone who TAed at Berkeley, in an 8 hour position, I’ve been feeling very conflicted.

    Why Does the University Owe Back Pay?

    At UC Berkeley, all TAs are paid an hourly wage. On top of this, all TAs who work at least 10 hours / week are entitled to childcare benefits and fee remission. The important part is the fee remission. If you TA for 10 hours a week or more, you are paid $7,500 for the semester on top of your hourly wage, which covers the in-state tuition for the semester.

    In the CS department, most TA positions used to be 10 hr / week or 20 hr / week positions. Starting around 2015-2016, many of these TA positions started turning into 8 hr / week appointments, making them ineligible for fee remission. Doing so let the department get more TA hours for the same amount of budget, since the pay that would have gone to fee remission gets turned into TA hours instead. They saw a welfare cliff, and decided to get as close as they could without going over.

    After this started spreading, the union filed a grievance against the University. Now, to be clear, nothing the University did was illegal. The contracts were clear that 8 hr / week positions were not eligible for fee remission, and the classes I helped teach made sure this was clear as well. However, the union argued that the University was effectively violating the spirit of the negotiated fee remission, by turning jobs that needed benefits into ones that didn’t. This is not a new practice. Companies have done this for a while, taking full-time jobs and reclassifying them as contractor positions.

    The arbitrator ruled in favor of the union, and the University has agreed to cooperate with the decision.

    Why Did UC Berkeley Start Doing This?

    In the last few years, UC Berkeley has had a perpetual funding problem. This, plus exploding interest in CS courses, plus professor salaries rising due to competition from industry, combines to put tons of strain on the CS department’s budget.

    In 2016, in light of protests by graduate student instructors (GSIs), there was a town hall to discuss the CS department’s budget, attended by professors, members of the union, and much of the CS department’s teaching staff, myself included. Throughout the town hall, the professors made it clear they supported the GSI protests, and would have hired TAs at 10 hr appointments if they had the funding for it. The Berkeley CS department does get some funding, but nowhere near enough to meet demand. The department does heavy outreach for donor support, using this to shore up the budget, but they don’t think it’s sustainable to rely on donors to the degree they are. They’ve repeatedly asked the Berkeley administration to give them more funding, and have consistently seen it go to non-academic areas, like athletics or more administrative jobs.

    One obvious solution was to restrict enrollment, instead of using this 8 hr TA loophole. However, CS enrollment is already insane. Some lower division courses literally have thousands of students. At the town hall, professors teaching these courses said they were happy to have the class be as large as possible, as long as there was TA support for it. At some point, the department decided that they’d rather have bigger classes than 10 hr TA appointments. My understanding was that they wanted this to be a one-time deal, but like the donor support, this trick became a normalized part of the budget.

    Were These 8 Hour TA Appointments Bad?

    It heavily depends on who you ask. Eight hour TA jobs were almost exclusively held by undergraduates, and in fact undergrads make up the majority of the CS department’s TA staff. This tends to surprise people, and can be interpreted as vaguely exploitative. Let me explain why it wasn’t.

    Undergrads started getting hired as TAs because Berkeley didn’t have enough grad student TAs to meet course demand. However, some professors found that undergraduate TAs did a better job than graduate TAs. For lower division courses, graduate and undergraduate students know the material equally well, but undergraduates actually took the course they were TAing. Grad students who learned it at different institutions were less familiar with how Berkeley taught the course.

    Additionally, some undergrad TAs would TA the same course several years in a row. This happened less with grad students, since after they met their TA requirements, they would switch to focusing on research. The increased continuity from undergrads made it easier to preserve course teaching culture and knowledge, which genuinely improved the quality of some classes.

    Finally, the increase in undergrad TAs was good for graduate school applications, since it gave more undergrads a connection to a professor who could eventually write them a letter of recommendation.

    From my point of view, these undergrad TA positions were a net positive for everyone involved.

    None of this is directly related to the union’s grievance. Fee remissions will be paid back to both undergrad TAs and graduate TAs. It is, however, indirectly related. One side effect of hiring many 8 hour TAs is that you have to hire more undergrads. More students got to hold TA positions, talk to professors, get letters of recommendation, and so on.

    You could argue this is Goodharting in action, since each professor gets less time to evaluate each TA. Maybe all it did was rubber-stamp more letters that said “this student helped me teach a course”, without actually saying anything useful for graduate school admissions. But in this instance, I don’t think the incentives are entirely unaligned. Part of TAing is to help students practice teaching. I taught one section a week in my 8 hour TA appointment, while 20 hour TAs taught two. I’m sure I would have learned from the 2nd section per week, but the marginal benefit from 0th to 1st is much bigger than 1st to 2nd, and splitting the TA load across more people meant a lot more people got that 0th to 1st experience.

    My view is that the CS department set a bad precedent that wasn’t entirely bad. Eight hour TAs are not a good fit for the entire UC Berkeley campus, since it’s pretty clear that if it was universalized, no one would get fee remission or childcare benefits. However, for the unique situation the CS department was in, the outcome wasn’t terrible. As much as I’ve mentioned budget issues, the CS department has it pretty good, relative to other departments. Donors for CS are pretty rich, and I know several students who funded their education through tech company summer internships. Strong students could get $7,000 a month or more during the summer, often with a housing stipend, and that could cover in-state tuition, housing, and food until the next summer if you planned your budget right, even accounting for the insanity of Bay Area rent. Most departments do not have this luxury.

    The problem was that departments that didn’t have these luxuries would be, and were, tempted to adopt similar policies, in a bid to fix budget problems of their own. Charts from UAW’s page show that the statistics department was starting to shift to the CS model. I think it’s good for the university to have 8 hour TA appointments go away, but I think it’s bad for the CS department to lose them.

    What’s Going to Happen Next?

    It’s still uncertain, but here’s my understanding based on discussion in the Berkeley Facebook student groups.

    First of all, existing TAs working less than 10 hours per week will continue to work the same hours, and will now receive fee remission as well. There is a contract negotiated by the union that prevents TAs from losing their jobs in the middle of the semester, so for now, this semester will play out as before.

    Starting next semester, TA hours are going to get more expensive. To minimize cost per hour, departments are incentivized to hire fewer TAs that work longer hours. Fee remission is a fixed cost paid once per student per semester, so you want that student to work as much as you can hire them for. In CS, this historically means 20 hours per week, but I’ve recently learned that Berkeley has 30 hour / week appointments in other departments, so it could go even higher. Fewer undergrad TAs will get to talk to professors, and fewer undergrads will sign up in the first place. If the only TA options were 20 hour appointments, I likely wouldn’t have taken any of them in my senior year, due to other time commitments.

    The administration will either need to allocate more TA budget, or CS class sizes will need to shrink. I’ve lost a lot of faith in the UC system over the years, and I expect it to raise the budget by a token amount that doesn’t cover the shortfall. CS class enrollment was already effectively at capacity with the 8 hr / week loophole, so it has to drop. The math I saw was that four 8 hour TAs cost the same as one 20 hour TA. If the budget doesn’t increase, a shift to 20 hour TAs means only 62.5% as many teaching hours as last semester. This is pretty crazy and I have no idea how they’ll even figure out enrollment.
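
    The 62.5% figure follows directly from that cost equivalence. As a rough worked check, assuming every 8 hour appointment is converted to a 20 hour one and the budget stays flat, each block of budget that used to buy four 8 hour TAs now buys one 20 hour TA:

    ```latex
    \[
    \frac{\text{weekly hours from one 20 hour TA}}{\text{weekly hours from four 8 hour TAs}}
      = \frac{1 \times 20}{4 \times 8}
      = \frac{20}{32}
      = 0.625
    \]
    ```

    That is a 37.5% drop in weekly teaching hours for the same spending. In practice some appointments were already 10 or 20 hours, so the real drop would likely be smaller than this worst case.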

    I’d like the union to negotiate higher pay per hour, in exchange for fee remissions, because one of the big lessons is that welfare cliffs can lead to bad consequences. If this happened, it would fix many of my issues with the current status quo, since professors could go back to offering many smaller TA appointments. However, it seems very unlikely the union will do this, and I’m not even a student anymore, so it’s not like I have much say in this.

    As with pretty much any story combining “UC Berkeley” with “budget”, it’s going to be a huge mess. Hopefully, this made it clearer why this decision was not a clear black-and-white victory for the workers, as much as some want to treat it that way.

  • What Size Should NeurIPS Be?

    Ostensibly, I’m on vacation. However, it’s raining, I have some inspiration, and I haven’t written a post in a while, so buckle up, here come some more machine learning opinions. I read some discussion about the size of NeurIPS, mostly around Andrey Kurenkov’s post at The Gradient, and wanted to weigh in.

    I’ve been to three NeurIPS: 2016, 2017, and 2019. So, no, I haven’t really been around that long. NeurIPS 2016 was my first academic conference ever, so I didn’t really know what to expect. By NeurIPS 2017, I’d been to a few and could confidently say that NeurIPS felt too big. By NeurIPS 2019, I was no longer sure NeurIPS was too big, even though it had over 60% more attendees than 2017.

    Before my first conference, I got some advice from senior researchers: if you aren’t skipping talks, you’re doing it wrong. I promptly ignored this advice and attended every talk I could, but now I get what they meant.

    Early on in your research career, it makes sense to go to talks. You know less about the field and you know fewer people. As you become more senior, it makes less sense to go to talks. It’s more likely you know a bit about the topic, and you know more people, so the value of talks goes down compared to the research conversations you could have instead. Conference organizers know this. Ever wonder why there are so many coffee breaks, and why they’re all much longer than they’d need to be if people were just getting coffee? Important, valuable meetings are happening during those coffee breaks.

    In the limit, people attend conferences to meet up with the people they only see at conferences. As someone from the Bay Area, the running joke is that we travel halfway across the world to talk to people who live an hour’s drive away. It’s not that we don’t want to talk to each other, it’s that the conference environment provides a much lower activation energy to scheduling meetups, and it’s easier to have serendipitous run-ins with old friends if we’re all in the same venue.

    In this model of a research conference, all the posters, accepted papers, talks, and so on are background noise. They exist as the default option for people who don’t have plans, or who want a break from socializing. That default option is critically important to keeping everything going, but it’s not the point of the conference. The point of the conference is for everyone in the research community to gather at the same place at the same time. If you’ve been to fan conventions, it’s a very similar dynamic.

    If you take this model as true, then NeurIPS’s unofficial status as the biggest ML conference is incredibly important. If you could only go to one conference each year, you’d go to NeurIPS, because everyone else is going to go to NeurIPS.

    And if NeurIPS is the place to be, shouldn’t NeurIPS be as big as necessary?

    * * *

    Well, maybe. NeurIPS attendance is growing, but the growth is coming from different places.

    Year over year, NeurIPS has been growing way faster than any of the PhD programs that could be feeding into it. I would guess it’s growing faster than the undergrads and master’s students as well. If the growth isn’t coming from universities, it has to be coming from industry and the broader data science community - a community that is much larger and of a different makeup than the traditional ML research crowd.

    I said NeurIPS is about networking, but the question is, networking between who? It started as networking between researchers, because the makeup of attendees started as researchers. It’s been shifting ever since deep learning hype took off. It is increasingly likely that if you talk to a random attendee, they’ll be an ML enthusiast or someone working in an ML-related role at a big company, rather than someone in a PhD program.

    And I should be really, really clear here: that’s not necessarily a bad thing! But people in a PhD program have different priorities from people working at a big company, and that’s causing a culture clash.

    The size debate is just a proxy for the real debate about what NeurIPS should be. We’re in the middle of an Eternal September moment.

    Eternal September is a term I really wish more people knew about, so here’s the short version. There used to be this thing called Usenet, with its own etiquette and social norms. Every September, new students from colleges and universities would get access to Usenet, and they’d stir a fuss, but the influx was small enough for existing Usenet culture to absorb them without much change. Then, AOL opened Usenet access to anyone who wanted it. Usenet culture couldn’t integrate the firehose of interest, and it became known as the Eternal September. The original Usenet culture disappeared, in favor of whatever culture made sense for the new users.

    The parallels to NeurIPS are uncanny. A simple find-replace exactly describes what’s happening now, from the people saying NeurIPS is turning into a spectacle, to the people complaining they can’t buy tickets to a conference they really want to attend.

    Despite their foreboding name, Eternal Septembers are not inherently bad. They are what they are. But generally, they’re good for people trying to join, and bad for people that are already there and like what they have.

    So the real question is, who is NeurIPS for? Is it for the established researchers to talk shop? The newer researchers trying to present their work and build a career? The data scientist looking for new applications of ML research? Right now, it’s for all of them, and the organizers are doing their best to balance everyone’s interests correctly, which is an incredibly difficult job I wouldn’t wish on anyone. The one thing that seems clear to me is that a pure, academic-only NeurIPS untethered from industry is never going to happen. Machine learning is currently too economically viable for industry to stop caring about it. You don’t stop Eternal September. Eternal September is something that happens to you. The most you can do is nudge the final outcome.

    It’s a crazy solution, and I don’t know if it even makes sense, but maybe NeurIPS needs to be split in two. Have one act as the submission venue, where people submit and present their research, with heavier restrictions on who’s allowed to attend, and have the other act as the open-to-everyone conference, with the two co-located to encourage some crossover. If NeurIPS’s growing pains are caused by it trying to be something for everyone, then maybe we need to split NeurIPS’s responsibilities. Except, I don’t actually know what that means.

    I do believe that it’s something people should be thinking more about. So, consider this as a call to action. September approaches, and thinkpieces or blog posts aren’t going to change what happens when it does.

  • Brief Update on AlphaStar Predictions

    Around the end of October, the Nature paper for AlphaStar was released. The Nature version has more significant restrictions on APM and camera movement, is able to play as all three races instead of just Protoss, can play the same maps that humans play on the competitive ladder, and reached Grandmaster level. However, it hasn’t done the “go 50-0 against pros” that AlphaZero did. It won some games against pros, but lost games as well. The rough consensus I’ve seen is that AlphaStar still has clear gaps in game knowledge, but is able to win anyways through very good mechanics.

    I haven’t read the Nature paper yet, and I’m not going to do a detailed post like I did last time, since I’ve been a bit busy. This is just to follow up on the predictions I made. I believe that if you make a public prediction, it’s very important to revisit it, whether you were right or wrong.

    I made two sets of predictions. The first set came from February 2019, shortly after the TLO and MaNa showmatches.

    “If no restrictions are added besides no global camera, I think within a year AlphaStar will be able to beat any pro, in any matchup, on any set of maps in a best-of-five series.”

    “If [APM restrictions] are added, it’s less likely AlphaStar will be that good in a year, but it’s still at least over 50%.”

    Since restrictions were added, the first prediction isn’t relevant anymore. As for the second prediction, there are still a few months before February 2020, but I’m a bit less confident. Maybe before I was 55% and now I’m more like 45%, and this is assuming DeepMind keeps pushing on AlphaStar. I’ve honestly lost track of whether that’s happening or not.

    In July 2019, I made this comment:

    “In AI for Games news, Blizzard announced that AlphaStar will be playing on the live SC2 competitive ladder, to evaluate agent performance in blind trials. AlphaStar can now play as all three races, has more restrictive APM limits than their previous version, and has been trained to only observe information from units within the current camera view. Should be fun - I expect it to beat everyone in blind games by this point.”

    This prediction was rather decidedly wrong. AlphaStar was very strong, but was not stomping every opponent it played against. Here, my mistake was overcorrecting for announcement bias. Historically, DeepMind only announces something publicly when they’re very confident in its result. So, when they disclosed that AlphaStar would be playing games on the EU ladder, I assumed that meant it was a done deal. I’m now thinking that the disclosure was partly driven by legal reasons, and that although they were confident it was worth doing human evaluation, that didn’t necessarily mean they were confident it would beat all pro players. It only meant it was worth testing if it could.

    Two concluding remarks. First, I personally found it interesting that I had no trouble believing AlphaStar was going to be ridiculously superhuman just 6 months after the original showmatch. For what it’s worth, I still think that was reasonable to believe, which is a bit strange given how short 6 months is.

    Second, we got spoiled by Chess, Go, and other turn-based perfect information games. In those games, superhuman game AIs always ended up teaching us some new strategy or new way to view the game, and that was because the only way they could be superhuman was by figuring out better moves than humans could. Starcraft is different. It’s part finding the right move, and part executing it, and that opens up new strategies in the search space. The fact that AlphaStar can win with subhuman moves executed well is less a problem with AlphaStar, and more a problem with the strategy space of APM-based games. If it’s an available and viable option, it shouldn’t be surprising that AlphaStar ends up picking a strategy that works. It’s disappointing if you expected something different, but most things are disappointing when they don’t match your expectations. So it goes.