Monday, July 9, 2018

My Recent Technical Interview Experience

I switched software engineering jobs earlier this year, mostly because I felt I wasn't growing enough at my previous company. It was very stressful, but I'm happy with how it turned out. Here are some assorted observations on the process of interviewing at different companies, written in haste right afterward and edited briefly some months later.

Recruiting Companies

I mostly went through two different recruiting companies for my search, TripleByte and Refdash.

TripleByte

Triplebyte was a great experience overall, in that they were respectful and kind, and got me onsite at a ton of companies very rapidly. And I do think they correctly optimized for interview/offer ratio in my case: I got offers from every company I had an onsite with through TripleByte.

The main downside was that I also wanted to filter for companies that paid a bunch more money, and they didn't do a very good job of filtering for that on top of the interview type. I would have preferred to do more interview prep on whiteboarding and algorithms and then interview at those companies, rather than only ending up with offers from companies not in the pay range I was looking for.

My Triplebyte recruiter lowballed me on salary - they suggested, if I were to mention a salary range, it should be $135k-150k. I didn't mention a range, and the company offered me $165k base. This was still lower than I was hoping, but substantially higher than the range the recruiter suggested. Also, I worked at a BigCo, and any of the TripleByte offers would have been a 50% pay cut or more for me. Obviously this wasn't malicious on their part, but I feel it reflects a lack of an accurate sense of the market for engineers at larger companies. Or maybe they simply don't have or didn't think I was qualified for higher-paying jobs. Regardless, I did not end up taking any offers from the companies I interviewed with through TripleByte.

Scheduling interviews and phone calls with TripleByte was a breeze. Their web app is extremely slick. No back-and-forth emails about "Well, Wednesday doesn't work for me, how about Thursday?" It removed a lot of stress from the process for me.

Refdash

Overall, Refdash was... okay. I did end up taking an offer through them.

They're still starting out, and it seems that many companies don't yet trust them with their phone interviews (unlike TripleByte, which is more established). So I ended up having to do several phone interviews in addition to Refdash. That was a huge pain. Doing phone interviews while still employed is particularly hellish; do I line several up and take the day off? Do I work from home and do one and then work extra time later in the day? Either way sucks, and I had limited ability to do either.

I failed the phone interview at Lyft, through Refdash. I feel that it was a rather silly puzzle/algorithm problem, and I didn't get the "aha" moment I was supposed to in a short enough time.

I did two phone interviews with another company via Refdash. I think I did well on both technically, but they rejected me after the second one. In the second one I asked them a question that, while more tactfully phrased, basically boiled down to "are you personally comfortable with the fact that your business model is evil?" and I suspect the rejection was due to, uhh, 'culture fit'.

I got the sense that my Refdash interviewer didn't respect me, so it was hard for me to work with him. Refdash kept suggesting that I do more interviews to "practice," and I couldn't help but think they didn't think I was a competent engineer. It's true I didn't do well on my second phone interview with them. It seemed like the interviewer was predisposed to think I wasn't doing well, in a way that made me feel anxious and do worse. He came off as very condescending both at the end of that call and on our later call. I don't want to speculate too much as to why, but it does make me a little suspicious that my husband didn't have this problem with them at all (he went through a job search with Refdash shortly before I did).

Scheduling phone interviews and recruiter calls with Refdash was a huge pain compared to TripleByte. This experience has taught me just how much I hate scheduling stuff over email: TREMENDOUSLY. I also really hate phone calls, especially with recruiters and other people I can't communicate with well. In a high-pressure situation, it's much easier for me to talk to other engineers, people who I know speak my language. TripleByte minimized the amount I had to do this, which I loved.

I feel bad that I didn't follow up with some of the companies Refdash connected me with. This job search caused me a tremendous amount of stress (I was hardly eating the whole week I was doing interviews), and I think it was reasonable for me to cut that off after a certain point. The Refdash companies seemed to be better-paying, though, so I wish I'd been able to endure the unpleasantness of interacting with the Refdash interviewers and doing extra phone interviews and calls.

On the plus side, Refdash did not lowball me on salary and gave me incredibly useful assistance with negotiation that resulted in my final job offer increasing by over 20%. Which I accepted. So I can't complain too much.

The Interview Experience

I took a whole week off to do five onsite interviews in a row. TripleByte made this pretty easy; 3/5 were through them, and the rest I managed to schedule quickly. This was a great decision. I had rehearsed various spiels and had all the relevant information in my head, and so there was a lot less context switching than there would have been if I'd been working in between.

I was surprised at how many companies asked me to write code on a laptop, "pair programming" style, or just do design interviews, with no standard algorithms-type whiteboarding. I suspect this is because Triplebyte slotted me into companies that do that type of interview.

Pair programming interviews were generally quite easy for me. Being able to actually run code and debug in an interview is amazing, and I wish more places allowed me to do it. Of course, "pair programming" in an interview generally means "they watch you write code and talk to you about it," rather than an actual collaboration, but it's still better than whiteboarding.

I did the Cracking the Coding Interview project grid to prepare: various behavioral question prompts for each project on your resume. As before, this was extremely helpful when I had to answer questions about my resume and previous projects.

My last onsite involved a lot of whiteboarding. I was surprised, since the recruiter told me to bring a laptop. By this time, though, I was so used to writing code under pressure that I think I did quite well. (I'll find out what my interviewers thought tomorrow. I got and accepted an offer from this company!)

I have a lot of trouble forcing myself to actually prepare for interviews, and still haven't figured out exactly why this is. Perhaps because I haven't done it in the past and so my System 1 thinks it's unnecessary. "Actually prepare" here means: 1) do problems on Leetcode or similar, 2) study basic Computer Science. I'm a fast learner, and I've picked up enough of (2) by now to do okay on that portion of interviews. But I don't enjoy the sort of "aha" "puzzle" problems that are still common at many companies, and so I don't like doing them to practice either.

What I learned about software companies

I do feel that this job search has given me a better sense of the market and what to look for, as well as good practice talking about what I've been doing at my current employer and so on. Even if I don't end up moving right now, I think I will in the future, and this is good practice. (I did end up switching employers.)

It shouldn't be surprising, but I was still surprised by how little the companies I interviewed with seem to adhere to good engineering practices. It seems there is a lot of low-hanging fruit there. In particular it seems like no one knows how the hell to do SRE, which is hilarious coming from someone with as little experience as me. Most places seem to do universal-ish code reviews these days, which is nice. (Not all though!)

Everyone who is hiring will always say "yes, we have a ton of code to write! I wish we had the problem of being over-resourced!" That's what my last employer said when I joined. They were wrong. Well - they weren't right for long, at least. Figuring out what to do next seems to always be a bottleneck on development, far more than the actual writing of the code and making of the thing to go. So it's a valuable skill, one that I'm still working on developing.

Asking good questions is tough, but I got better at it over time.

  • "How has this place changed since you joined?" didn't seem to go very well. People would usually tell me some generic stuff about how they've gotten bigger, expanded into new markets, etc., which isn't very helpful for getting a sense of the culture. I occasionally had time to follow up with "How has your day-to-day changed?" which was more illuminating; it's possible you could frame this question to get those answers upfront.
  • "What are your hours?" is nice to ask a couple people, just to get a sense.
  • "How much time do you spend in meetings?" is a bit of a leading question, and will usually end with the person concluding "... not too many." It helped to ask "what are your regularly scheduled meetings?" - this ties in nicely to questions about project planning and so on.
  • "What is something you find difficult about working here?" is nice - a different phrasing of the "tell me something negative / what would you change" question. I like this one because it doesn't restrict them to thinking about things that could be changed - it also illuminates issues that aren't really fixable or shouldn't be fixed, e.g. "our product handles sensitive data and the security requirements are a bit painful."
  • "How does this compare to other places you've worked?" - Good because it gets at specifics, and also handy to get a sense of where your interviewers were before they were here. Asking where they worked before also puts their answers in perspective. E.g. "we have a really scrappy culture!" means less from someone who worked at Google than someone who worked at Facebook.
  • "How do you collaborate with your team?" - e.g. do you talk to them on Slack, turn around to the person behind you, send emails ... ? More specifically: "What do you do if you're stuck on a technical issue?" This is a great barometer of the team dynamics.

Conclusion

Looking back, I realize that I'd almost blocked out of my memory how incredibly stressful the interview experience was. I think next time I switch jobs, I will seriously consider quitting first and interviewing later.

Changing jobs has also helped me realize how valuable the skills that I had at my previous job are. I hadn't expected so much of it to be transferable, but it turns out having a sense of good practices and "how things are done" in a variety of respects can give you a really good framework for development, which I now realize not all software developers have.


Was the job change worth it? Probably. Some months in, I'm writing a lot more code and using my brain a lot more than I was at my previous job. So I'm happy now. I won't be interviewing again for some time.

Thursday, July 5, 2018

Algebraic Data Types for Imperative Programmers

Algebraic data types are a fantastic way to make code clearer. They're used all the time in Haskell, because they're a fundamental part of the language, easy, and lightweight. But it turns out you can use them in imperative programming languages like Python and C++, too, with just a bit of hackery.

What's an algebraic data type?

Having "algebraic data types" means that you can create a new type by doing AND or OR on two existing types.

In most languages, doing an AND is easy. In this context, AND just means that "to get an item of this new type, you need an item of a building block type AND another item of a second building block type." For example, in C++, you can use a pair or a struct:
std::pair<int, std::string> person_description = { 50, "Walter White" };
"To get a person description, you need an integer (age) and a name (string)."
In Python, we can use a tuple:
person = (50, 'Walter White')
In both of these cases, to fill out the data type, we need an integer AND a string.
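A bare tuple doesn't say which slot is which. As a minimal sketch, here is the same AND type with named fields using typing.NamedTuple; the PersonDescription name and its field names are made up for illustration.

```python
from typing import NamedTuple


class PersonDescription(NamedTuple):
    """An AND type: every value has an age AND a name."""
    age: int
    name: str


person = PersonDescription(age=50, name="Walter White")

# Still an ordinary tuple underneath, so it unpacks like the plain tuple.
age, name = person
```

The fields can be read by name (person.age) or by position, so existing tuple-based code keeps working.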

However, most languages don't make it easy to represent an OR. Suppose you want to represent an animal that could be of multiple types, and would have different properties depending on the particular animal. In Haskell this looks like this:
data Cat = Cat { whiskerLength :: Float }
data Dog = Dog { tailLength :: Float, noseWetness :: Float }
data Pet = Feline Cat | Canine Dog
We're declaring that there's a thing called a Cat, and a Pet can be Feline, in which case it has all the properties of a Cat, or it can be Canine, in which case it has the properties of a Dog.

In C or C++, declaring the Cat and Dog structures is easy:
struct Cat {
  float whisker_length;
};
struct Dog {
  float tail_length;
  float nose_wetness;
};
But declaring a type that could be "a cat OR a dog" is not. Most popular programming languages don't natively support the concept of an OR type, so you can't create any arbitrary algebraic data type with them.

In a case like this, we would often just create a Pet structure that has both a Cat and Dog, and try to enforce in the functions we write for Pet that only one at a time is ever filled out.
struct Pet {
  // Only one should be non-null.
  Cat* cat;
  Dog* dog;
};
But if we don't make a bunch of tedious if Cat checks, then we might make a mistake:
float totalLength(const Pet& pet) {
  // Cats are 20 cm long on average, plus whiskers.
  return 20.0 + pet.cat->whisker_length;
  // BOOM! If it's a Dog, we done goofed.
}

Whereas if we had an algebraic data type, the compiler would object to us doing this.
totalLength :: Pet -> Float
totalLength pet = 20.0 + (whiskerLength pet)
-- ERROR: whiskerLength works on a Cat, not on a Pet.

Instead we have to use pattern matching:
totalLength (Feline cat) = 20.0 + (whiskerLength cat)
-- This obviously only works when the Pet is Feline.
... and if you have warnings turned on, the compiler will warn you that this is a partial function that crashes on Pets that are a Dog.
Another solution to this might be object inheritance.
class Pet {
 public:
  virtual ~Pet() = default;
  virtual float length() = 0;
};
class Cat : public Pet {
 public:
  float length() override {
    return 20.0 + whiskerLength;
  }
 private:
  float whiskerLength;
};
class Dog : public Pet {
  // No length() override here, so Dog is still abstract.
  float tailLength;
  float noseWetness;
};

This approach also has the benefit of being compiler-checked; if you try to create a Dog object, compilation will fail because Dog never implements length() and is therefore still an abstract class. On the minus side, if you have a Pet object in a function, it's hackier to determine whether it's actually a Dog or a Cat under the hood. In fact, you have no way to enumerate all the possible types of Pet and handle them differently in a later function; all the differing logic has to be encoded in the class itself.
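The same inheritance idea can be sketched in Python with the standard abc module. One difference worth flagging: Python only complains at runtime, when you try to instantiate the incomplete class, rather than at compile time. The try_to_make_dog helper is a made-up name for this illustration.

```python
from abc import ABC, abstractmethod


class Pet(ABC):
    @abstractmethod
    def length(self) -> float:
        """Total length in centimeters."""


class Cat(Pet):
    def __init__(self, whisker_length: float) -> None:
        self.whisker_length = whisker_length

    def length(self) -> float:
        # Cats are 20 cm long on average, plus whiskers.
        return 20.0 + self.whisker_length


class Dog(Pet):
    # Deliberately missing length(), so Dog stays abstract.
    def __init__(self, tail_length: float, nose_wetness: float) -> None:
        self.tail_length = tail_length
        self.nose_wetness = nose_wetness


def try_to_make_dog() -> str:
    """Report whether Python lets us instantiate the incomplete Dog."""
    try:
        Dog(tail_length=30.0, nose_wetness=0.5)
    except TypeError:
        return "rejected"
    return "created"
```

Calling try_to_make_dog() hits the TypeError branch, because abstract classes with unimplemented abstract methods cannot be instantiated.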

Why should you use algebraic data types?

Algebraic data types make it clearer to express lots of concepts, and as a bonus, they can help guarantee that your code does what it should.

Everyone uses AND data types - packaging data into named structures is seen as common sense. But while OR data types are used less commonly, they also can make code much clearer.

Here's an example. Suppose we have a program for interpreting a handwritten phone number from an image of a scanned form. If we get back text from the form, we want to try to turn it into someone's phone number. But it's possible someone filled out the form incorrectly, with their name in the "Phone Number" field. So the text may be invalid.
So we need to be able to represent three cases:
A phone number is an integer.
data PhoneNumber = PhoneNumber Integer
InvalidText is an object with the original text and an error message.
data InvalidText = InvalidText { originalText :: String, errorMessage :: String }
An image error is an error that results from finding no text in the image.
data ImageError = ImageError
And to represent "one of these things," we create a new algebraic data type. A PhoneNumberOrError is either a phone number, invalid text, or an image error. Each alternative gets its own constructor that wraps the corresponding type.
data PhoneNumberOrError = Number PhoneNumber | Invalid InvalidText | NoText ImageError
We have a function to get a PhoneNumberOrError from an Image.
getPhoneNumber :: Image -> PhoneNumberOrError

This is nice, because now we know we have to handle all the resulting cases in downstream code. If we try to write a function that doesn't handle those cases, we get an error.
callPhoneNumber :: PhoneNumber -> IO ()
callPhoneNumber = ...

callWrittenPhoneNumber :: Image -> IO ()
callWrittenPhoneNumber image = do
    callPhoneNumber (getPhoneNumber image)
    -- COMPILE ERROR: PhoneNumberOrError is not a PhoneNumber.
Instead we define this function, and use pattern matching to check the different cases:
maybeCallPhoneNumber :: PhoneNumberOrError -> IO ()
"If it's a phone number, call it."
maybeCallPhoneNumber (Number num) = callPhoneNumber num
"If it's an image error, print that out."
maybeCallPhoneNumber (NoText ImageError) = putStrLn "Error processing text from image."
"If it's invalid text, print it out, and also the error message."
maybeCallPhoneNumber (Invalid invalidText) = putStrLn (
    "Found invalid text: '" ++ originalText invalidText ++
    "'. Error message: " ++ errorMessage invalidText)
And now we can define our callWrittenPhoneNumber function:
callWrittenPhoneNumber :: Image -> IO ()
callWrittenPhoneNumber image = maybeCallPhoneNumber (getPhoneNumber image)

If we were doing this in a traditional imperative way, this would look very different, depending on the programming language. Perhaps, in C++, we'd return an integer that could be a phone number or an error code:
// -1 means no phone number was found; see error message for details.
int getPhoneNumber(const Image& image, string* error_message) { ... }
We left a nice comment on our function, but what if the next engineer was in a hurry and forgot to read it?
void callPhoneNumberInImage(const Image& image) {
    string error_message;
    int phone_number = getPhoneNumber(image, &error_message);
    callPhoneNumber(phone_number);
}
... oops! Now we've tried to call the number -1, because it wasn't enforced that we should handle all the possible cases.

And if we've got to write down what our function returns anyway, why not make it machine-readable, too? Plus, the error message is opaque, and can't tell us which type of thing went wrong.
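To make the contrast with the -1 sentinel concrete, here is a rough Python sketch where each outcome is its own small class, so the caller can tell which kind of thing went wrong. Everything here is an assumption for illustration: parse_phone_text is a hypothetical stand-in that classifies already-extracted text (no image handling), phone numbers are stored as digit strings, and the 7-digit threshold is arbitrary.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class PhoneNumber:
    digits: str


@dataclass(frozen=True)
class InvalidText:
    original_text: str
    error_message: str


@dataclass(frozen=True)
class ImageError:
    pass


PhoneNumberOrError = Union[PhoneNumber, InvalidText, ImageError]


def parse_phone_text(text: str) -> PhoneNumberOrError:
    """Hypothetical stand-in for the OCR step."""
    stripped = text.strip()
    if not stripped:
        # No text at all: treat it as an image-level failure.
        return ImageError()
    digits = "".join(ch for ch in stripped if ch.isdigit())
    if len(digits) < 7:  # arbitrary cutoff for this sketch
        return InvalidText(stripped, "not enough digits for a phone number")
    return PhoneNumber(digits)


def describe(result: PhoneNumberOrError) -> str:
    """Handle every alternative explicitly; there is no magic -1."""
    if isinstance(result, PhoneNumber):
        return "calling " + result.digits
    if isinstance(result, InvalidText):
        return "invalid text '" + result.original_text + "': " + result.error_message
    if isinstance(result, ImageError):
        return "no text found in image"
    raise TypeError("unhandled case: " + repr(result))
```

Python itself won't force callers to check each case, so this only pays off fully when a static type checker is run over the code.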

Why type safety?

Why do you need union types when you can just return whatever you want, like in Python? Well...
class Dog:
  def __init__(self, tail_length, nose_wetness):
    self.tail_length = tail_length
    self.nose_wetness = nose_wetness

class Cat:
  def __init__(self, whisker_length):
    self.whisker_length = whisker_length

def length(pet):
  return pet.whisker_length + 20.0  # OOPS. Doesn't work for Dog.


  • The return type of the function isn't checked by the compiler. Anyone can add a return statement with any old type, and break your assumptions about the thing you got back from that function.
  • With duck typing, it can be hard to track down exactly where the method and attribute you're trying to use gets populated, or whether it applies in each case.
In contrast, with ADTs, the return type is well-specified, and you know which category of thing you got back from the function.
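As a sketch of what that could look like for the pet example in Python, the annotation below spells out that a pet is a Cat OR a Dog. The 60 cm dog body length is an invented number, and the assumption is that a static checker is run over the code, since Python ignores annotations at runtime.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Cat:
    whisker_length: float


@dataclass
class Dog:
    tail_length: float
    nose_wetness: float


# The OR type: a Pet is a Cat or a Dog.
Pet = Union[Cat, Dog]


def length(pet: Pet) -> float:
    if isinstance(pet, Cat):
        # Cats are 20 cm long on average, plus whiskers.
        return 20.0 + pet.whisker_length
    if isinstance(pet, Dog):
        # Invented average body length for this sketch, plus tail.
        return 60.0 + pet.tail_length
    raise TypeError(f"not a Pet: {pet!r}")
```

With the Pet annotation in place, a checker can flag an unguarded pet.whisker_length access, because Dog has no such attribute.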


How do you use them?

In Haskell, writing an ADT is simple. But how do you use them in languages that don't support them natively?

There are a few different workarounds available at this point.

Protocol Buffers and friends

Here's one way: Protocol Buffers. While they are mainly used as an encoding and RPC format, because of their support for the oneof keyword, protocol buffers are actually algebraic data types as well!

Here's how we could write the exact same code in C++ and proto, using protos to represent our ADT:
phone_number.proto
message InterpretedPhoneNumber {
  oneof result {
    int64 phone_number = 1; // That's the right data type for phone numbers, right?
    string invalid_text_error = 2;
    string image_error = 3;
  }
}

call.cpp
#include "phone_number.pb.h"

InterpretedPhoneNumber getPhoneNumber(const Image& image) { ... }

void callPhoneNumberInImage(const Image& image) {
  auto maybe_phone_number = getPhoneNumber(image);
  switch (maybe_phone_number.result_case()) {
    case InterpretedPhoneNumber::kPhoneNumber:
      callPhoneNumber(maybe_phone_number.phone_number());
      break;
    case InterpretedPhoneNumber::kInvalidTextError:
      cout << "Error in phone number text: "
           << maybe_phone_number.invalid_text_error() << endl;
      break;
    case InterpretedPhoneNumber::kImageError:
      cout << "Error in phone number image: "
           << maybe_phone_number.image_error() << endl;
      break;
    case InterpretedPhoneNumber::RESULT_NOT_SET:
      // This should never happen. Must be enforced by 'getPhoneNumber'.
      cout << "Invalid interpreted result" << endl;
      break;
  }
}

With this approach, you can get the compiler to warn you if you haven't included all the cases, because the oneof generates an enum that can be checked in a switch statement like any other. It's also clear what the possible types of result are; it's encoded in the type names, rather than a comment.

Caveat: Protocol buffers unfortunately may have significant performance overhead, so in speed-critical applications, they may not be appropriate if all you need is an ADT.

Similarly to Google's Protocol Buffers, Cap'n Proto can also support ADTs, using named unions.

Other ways to get ADTs in non-Haskell languages

I'm not as familiar with these tools, but here's a list of things I've heard can help you use ADTs in languages that don't support them natively:

  • C++17 supports std::variant, a class template for type-safe tagged unions: a variant always knows which alternative it currently holds.
  • TypeScript, a typed superset of JavaScript, supports ADTs through union types.
  • Python's typing module has Union[] and Optional[] annotations that you can use as return types, and static checkers such as pytype or mypy can verify them.
  • C does have a feature called "unions." However, they're not great, because you can't tell which data member is filled out by looking at the union. You have to just try and hope that you got it right.
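The problem with untagged unions can be demonstrated without leaving Python, since the standard ctypes module exposes C-style unions. This is only a sketch; the IntOrFloat name and its field names are made up, and the exact bit pattern assumes the usual IEEE 754 single-precision float.

```python
import ctypes


class IntOrFloat(ctypes.Union):
    # Both fields share the same four bytes, as in a C union.
    _fields_ = [("as_int", ctypes.c_uint32), ("as_float", ctypes.c_float)]


value = IntOrFloat()
value.as_float = 1.0

# Nothing records that the float member was the one written, so reading
# the other member silently reinterprets the same bytes as an integer.
reinterpreted = value.as_int
```

Here reinterpreted ends up as 0x3F800000, the bit pattern of the float 1.0, with no error or warning along the way.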
Are there other tools I'm missing? Or other solutions to this kind of problem?

Thursday, September 28, 2017

Save Your Work

Here's a useful habit I've picked up as a software engineer. Every time you do something difficult, create a reproducible artifact that can be used to do it more easily next time, and shared with others.

Some examples of this:

  • You spent all afternoon debugging a thorny issue. Write down the monitoring you checked and the steps you took to reach the conclusion you did. Put these details in the issue tracker, before moving on to actually fix it.
  • You figured out what commands to run to get the binary to work properly. Turn the commands into a short script and check it into source control.
  • You spent a day reading the code and figuring out how it works. Write yourself some notes and documentation as you go. At the end, take half an hour to clean it up and send it to your boss or teammates who might find it helpful. Maybe even put up a documentation website if that seems appropriate.

This makes it easier to pick up where you left off for next time (for you or someone else), and makes it easier to prove that the work you're doing is difficult and has value.