## Congratulations to Ed Chi and Patrick Baudisch!

GroupLens would like to congratulate Ed Chi (a Ph.D. graduate from our lab) and Patrick Baudisch (a former visiting graduate student to our lab) on being named ACM Distinguished Scientists! We wish them all the best and are very proud of their continued accomplishments!

## Preparing for a Google technical interview

GroupLens students and alumni successfully interview at Google on a regular basis. Several current GroupLens students have interned at the company, and our alumni have become Google research scientists and software engineers. I collected the following technical interview preparation tips from Google recruiters and engineers. Please ask your recruiter if you need confirmation of anything below, as the interview structure changes over time. This advice applies to other companies that are interested in similar problems or which hire the same types of engineers. And if you’re interviewing for an internship instead of a full-time job, there will be a different standard; not necessarily harder or easier, just different, based on the type of student you are and your other interests.

Technical interviews are each about 45 minutes long. There is no dress code. You will code on a whiteboard, showing the interviewer your thought process by talking through decisions and assumptions. Occasionally a video chat and collaborative document allows you to interview from a distance, or a piece of paper substitutes for the whiteboard during in-person interviews. Interview topics may cover anything on your résumé, especially where you claim expertise. Fundamental computer science knowledge is required for all engineering roles at Google and will form the basis for almost all interview questions. Google wants to see if you can take a hard, big problem for which you don’t know an obvious solution, and break it down into manageable solvable parts for which you can provide reasonable runtime and space bounds.

# Approaching Technical Interview Problems

## Talk and listen!

Google is interested in your problem-solving approach. This means talking and asking questions. The interview questions will be in-depth, and Google wants to see how you think about complicated problems. Correct answers are often not necessary if you’ve shown a mature thought process. It’s okay and encouraged to question your interviewer. Ask for help when you need it. If you need to assume something, verbally check with your interviewer that it is a correct assumption! These questions should be problems you haven’t come across before, and in fact, the ethics of technical interviewing require you to tell the interviewer if you’ve seen a problem already.

Continue talking! Don’t let an opportunity go by to show the interviewer how you think. That way, if you’re stuck, or if you have talked about the right solution or path but discarded it, your interviewer may provide a hint. Continue listening! Communication throughout the interview is key. Not only will you improve your chance of solving the problem and show the interviewer your thoughts, but the interviewer will also judge the teamwork by which you both get to the solution. A Google engineer is not only a person who arrives at an optimal answer, but also a person who can arrive at that answer collaboratively and explain it clearly.

As an aside: there will likely be parts of the interview where you’ll chat with the interviewer about your skills, interests, education, career, and projects. Remember to be genuine and interesting. Talk about your passions. Listen when the interviewer tells you what team she’s on at Google; you might want to follow up with an appropriate anecdote from your own life. The interviewer is likely asking herself, “would I want to work next to this person for several years?” These are often called soft skills. Learn and practice them if necessary.

## Code well

Define the problem in parts, noting your assumptions (for instance, input size or available utility classes) and ideas about possible edge cases. Most questions should be able to be answered in fewer than 20 lines of code, almost always fewer than 30. Many solutions are simple struct classes, sufficient to do the job but not too generic. Use your strongest programming language with no pseudocode. Pick good variable names. Run through the code yourself and clean up bugs right away. Ensure it works by using examples. Try to find edge cases. Watch for off-by-one errors!

Questions will be progressive; for instance, how does your solution scale to an input of a size of several billion? Your solutions should also be progressive: first show a simple solution and then talk about the ways in which you could optimize your solution. You should be able to discuss pros and cons of each optimization choice, and the interviewer may jump in to ask you to solve a particular one.

Don’t guess an answer. Keep solutions simple but avoid brute force to solve the problem. If you choose a complex solution, the interviewer has many different ways to poke holes in it.

Though you likely won’t have to prove your solution’s order of complexity, you must be able to know or approximate it well. Avoid exponential time and space; try to solve the problems in linear or log time.

Recruiters note: “Be quick to comprehend and solve problems. Enjoy finding multiple solutions before choosing the best one. Seek out new ideas and methods of tackling a problem. Be inventive and flexible in your solutions and open to new ideas. Move up to more complex problem solving.”

# Technical Interview Topics

I quote below from Google recruitment emails.

## Coding

Know how to “construct and traverse data structures, implement system routines, distill large data sets to single values, and transform one data set to another”. “You should know at least one programming language really well, and it should preferably be C++ or Java. C# is OK too, since it’s pretty similar to Java.” “You will be expected to write some code in at least some of your interviews. Google is keen to see really high-quality, efficient, clear code without typing mistakes. Because all engineers (at every level) collaborate throughout the Google code base, with an efficient code review process, it’s essential that every engineer works at the same high standard.”

## Algorithm Design and Analysis

Know “string parsing, Big-O analysis, sorting and hashing, searching, and handling obscenely large amounts of data”. “If you struggle with basic big-O complexity analysis, then you are almost guaranteed not to get hired.” “You should study up on as many other data structures and algorithms as possible. You should especially know about the most famous classes of NP-complete problems, such as traveling salesman and the knapsack problem, and be able to recognize them when an interviewer asks you them in disguise.”

### Sorting

“Know how to sort. Don’t do bubble-sort. You should know the details of at least one n*log(n) sorting algorithm, preferably two (say, quicksort and merge sort). Merge sort can be highly useful in situations where quicksort is impractical, so take a look at it.”

## Data Structures

Know hash tables, vectors, graphs, and how to combine data structures, such as finding the median of an array of strings.

### Hash tables

Hash tables are “arguably the single most important data structure known to mankind. You absolutely should know how they work. Be able to implement one using only arrays in your favorite language, in about the space of one interview.”

### Graphs

“Graphs are really important at Google. There are 3 basic ways to represent a graph in memory (objects and pointers, matrix, and adjacency list); familiarize yourself with each representation and their advantages and disadvantages. You should know the basic graph traversal algorithms: breadth-first search and depth-first search. Know their computational complexity, their tradeoffs, and how to implement them in real code. If you get a chance, try to study up on fancier algorithms, such as Dijkstra and A*.”

### Trees

“Know about trees; basic tree construction, traversal and manipulation algorithms. Familiarize yourself with binary trees, n-ary trees, and trie-trees. Be familiar with at least one type of balanced binary tree (such as a red/black tree, splay tree or AVL tree) and know how it’s implemented.” Understand breadth-first and depth-first tree traversal algorithms and “know the difference between inorder, postorder and preorder.”

## Mathematics

“Some interviewers ask basic discrete math questions. This is more prevalent at Google than at other companies because Google engineers are surrounded by counting problems, probability problems, and other Discrete Math 101 situations. Spend some time before the interview refreshing your memory on (or teaching yourself) the essentials of combinatorics and probability. You should be familiar with n-choose-k problems and their ilk, the more the better.”

## System Design

Know “feature sets, interfaces, class hierarchies, designing a system under certain constraints, simplicity and robustness, and tradeoffs”. Be aware of Google’s distributed systems, including Google File System, Bigtable, and MapReduce.

## Operating Systems

“Know about processes, threads and concurrency issues. Know about locks and mutexes and semaphores and monitors and how they work. Know about deadlock and livelock and how to avoid them. Know what resources a process needs, and which a thread needs, and how context switching works, and how it’s initiated by the operating system and underlying hardware. Know a little about scheduling. The world is rapidly moving towards multi-core, so know the fundamentals of modern concurrency constructs.”

## Open-Ended Discussions

As you write a résumé and think about how to talk about yourself, know that Google cares about projects and completion against goals. Make sure you document your work well. Internal candidates (such as interns interviewing for full-time positions) are judged on project outlines and design documentation, committed code, and performance against goals. As an external candidate, you ought to explain exactly how your work contributed to measurable important improvements and outcomes. Your teamwork and communication skills, as well as general company culture fit (“Googleyness”), are important.

# Sources

• I was an intern at Google in the summer of 2014. During this time I spoke with several Google software engineers who conducted enough interviews to be (in Google parlance) “well-calibrated”. The books listed above are their suggestions, as are the non-quoted parts of the text. There is nothing secret or confidential about these tips; I just went to the trouble of organizing everything.
• Detailed explanation of possible interview topics are direct quotes from a Google recruiter, in preparation for my own technical interviews, and recruiter colleagues who schedule on-campus interviews. The blog posts, YouTube videos, and other websites linked above, as well as the “Review of Basic Algorithms” and “Programming Interviews Exposed” books, are the recruiters’ suggestions. All quoted text above comes from public recruitment documentation.

# Contact

## Recommending for new users is surprisingly difficult

For years MovieLens has required users to enter 15 ratings before they are allowed to get personalized recommendations. This design makes sense: how can we make recommendations for a user we know nothing about? That said, we don’t know if this provides users with the best experience. Why should users have to enter fifteen ratings, why not ten or five? What would happen if we let users into the system without any ratings? To answer these questions we need to understand how our algorithms behave for users with very few ratings.

To understand how algorithms behave for users just joining the system, we looked at historic MovieLens ratings. We trained three popular recommender algorithms: ItemItem, UserUser, and SVD on this rating data. While training, we limited some users to have only a small number of ratings. We used the ratings that were not given to the algorithm to measure several things:

• How accurate are the predictions? Can the algorithm accurately predict the user’s future ratings?
• How good are the recommendations? Does the algorithm suggest movies for the user that the user would like?
• What type of recommendations does the algorithm generate? Is there a good diversity of movies? Are the movies popular, or more obscure?

## Evaluating an External Recommender written in Python using LensKit

In this article, we show how to use LensKit to evaluate a recommender written in Python.  We wrote this article to help people who want to use LensKit’s built-in evaluation capabilities and comparison algorithms, but don’t want to implement their own algorithms in Java.  Evaluating an external recommender — whether in R, Python, or MatLab, involves three primary steps:

• Writing the recommender. We will need a simple recommender written in language other than Java (Python in this case) that can take test data to build up a simple model and generate recommendations for a given list of test users.
• Setting up a shim class. We will need to write a small class that teaches LensKit how to use our external algorithm.
• Setting up LensKit evaluation. Finally we show how we setup an experiment using the shim class in a LensKit eval script to evaluate the external recommender.

Note, that the data we will use to test this recommender is a MovieLens rating dataset. The data consists of movie ratings with each row being <userId,itemId,rating>. You can read more about the dataset here. (more…)

## A Tale of Cities: Urban Biases in Volunteered Geographic Information

Yesterday, we posted a write-up to the excellent “Follow the Crowd” blog about our ICWSM 2014 paper that looks at urban and rural participation in social media.

We encourage folks to check out the post!