Ask any engineering leader at a growth-stage company what their top priority is, and they’ll likely say hiring. When we think about how big a decision taking a job is for both the company and the candidate, the few hours of interviews seem pretty short. We want to make sure our job interview process makes the most of that time to help both candidates and Databricks understand if the role is a good fit. We want to learn about you and make sure you get the information you need to make the best decision. One of the best ways to do this is to design interviews that emphasize conversation and collaboration. Real-world problems are messy and complex. We want to understand how candidates solve abstract challenges more than we want to see a specific solution.
What do you want candidates to understand about the data team at Databricks before entering the interview process?
Despite the scale of infrastructure Databricks operates, we have a relatively small engineering organization. We operate millions of virtual machines, generating terabytes of logs and processing exabytes of data per day. At our scale, we regularly observe cloud hardware, network, and operating system faults, and our software must gracefully shield our customers from all of them. We do all this with fewer than 200 engineers.
Our size means we have the flexibility to adopt or create the technology we believe is the best solution for each engineering challenge. The flip side of that is there are many parts of our infrastructure that are still maturing, so the set of concerns for many initiatives expands beyond the scope of a single service. It’s also still a startup so the boundaries of ownership and responsibility aren’t always clear. That means it’s easy to make changes and have an impact outside your core focus areas, and that you’ll own much more of a project than you would somewhere else.
What are you going to be a master of after working at Databricks? You will be able to create scalable systems within the Big Data and Machine Learning fields. Most engineers don’t do applied ML in their day-to-day work, but we deeply understand how it’s being used across a range of industries for our customers.
How can you prepare for technical interview questions?
Our engineering interviews consist of a mix of technical and soft skills assessments between 45 and 90 minutes long. While some of our technical interviews are more traditional algorithm questions focused on data structures and computer science fundamentals, we have been shifting towards more hands-on problem solving and coding assessments. Even on the algorithm questions, candidates are welcome to work through the problem on a laptop rather than a whiteboard if they prefer. This helps us get a sense of how they write code in a more realistic environment. For our coding questions, we focus less on algorithm knowledge and more on design, code structure, debugging and learning new domains. For example, some of our technical questions will probably use a language/framework you are unfamiliar with so you’ll need to demonstrate an ability to read documentation and solve a problem in a new area. Other questions involve progressively building a complex program in stages by following a feature spec.
We also adapt our interviews based on the candidate’s background, work experience, and role. For more full-stack roles, we spend more time on the basics of web communication (HTTP, WebSockets, authentication), browser fundamentals (caching, JS event handling), and API and data modeling. For more low-level systems engineering, we’ll emphasize multithreading and OS primitives.
I recommend three things to prepare:
- Find coding questions online and practice solving them completely. This means creating full working code and tests without looking at the solution. Creating tests is important; some of our technical questions have several stages, so you’ll want to be able to quickly set up a test harness for a fast edit/compile/debug loop during the interview, just like you would for your day-to-day work.
- Review computer science fundamentals. Know common data structures, the runtime and memory utilization of each method, and their interface in the language you plan to use. This technical interview handbook on GitHub is a good overview of the different data structures, but you should also study systems concepts like multithreading, concurrency, locks and transactions.
- Do mock interviews. The time pressure and dialogue of a mock job interview is a great way to get comfortable before the real thing. Have a friend ask you questions you don’t know and hint along the way as needed.
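As one illustration of the quick test harness mentioned above, here is a minimal sketch in Python. The problem (`word_counts`), the case table, and the `run_cases` helper are all hypothetical names chosen for this example, not an actual Databricks interview question; the point is only that a table of cases plus a small runner gives you a fast loop you can extend at each stage.

```python
from collections import Counter


def word_counts(text):
    """Stage 1 of a hypothetical staged question: count words, ignoring case."""
    return dict(Counter(text.lower().split()))


# Each case is (description, input, expected output).
# Add new rows as each stage of the question introduces new requirements.
CASES = [
    ("empty input", "", {}),
    ("single word", "spark", {"spark": 1}),
    ("mixed case", "Spark spark SPARK", {"spark": 3}),
    ("two words", "delta lake delta", {"delta": 2, "lake": 1}),
]


def run_cases(func, cases):
    """Run every case and return a list of failure messages (empty means all passed)."""
    failures = []
    for description, argument, expected in cases:
        actual = func(argument)
        if actual != expected:
            failures.append(f"{description}: expected {expected!r}, got {actual!r}")
    return failures


if __name__ == "__main__":
    failures = run_cases(word_counts, CASES)
    for message in failures:
        print("FAIL", message)
    print(f"{len(CASES) - len(failures)}/{len(CASES)} cases passed")
```

Re-running the script after every edit keeps the feedback loop short, and the failure messages show exactly which requirement broke.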
Haoyi on our Dev Tools team wrote a great blog post on how to interview effectively that gives good insight into how we structure our interviews and what we look for.
What are the most common mistakes you see during interviews?
Now that we’ve covered what we look for and how to prepare for interviews, there are a few things you should consciously try not to do during an engineering job interview.
The main one is lacking passion or interest in the role. Remember, you are interviewing the company as well, and it’s important you show that you are invested in making a match. Having low enthusiasm, not being familiar with the Databricks product, not asking any questions, and in general relying on the interviewer to drive the entire conversation are all signs you aren’t interested. Just as you want an interview process that challenges you and dives into your skills and interests, we like a candidate who asks us tough questions and takes the time to get to know us.
For technical interviews, if a candidate is pursuing a solution that won’t work, we try to help them realize it before they spend a lot of time on implementation. If the interviewer is asking questions, chances are they are trying to steer you towards a different path. Rather than staying fixed on a single-track solution, take a minute to step back and reconsider your approach in light of the new hints or questions. Remember that your interviewer has probably asked the same question dozens of times and seen a range of approaches. They also want to see how you’d respond in a real-world environment, where you’d be working with a team that offers help in a similar way.
For interviews focused on work history and soft skills, have specific examples. It’s ok to start with a broad generalization, but tell a story about how specific examples in your past work history answer the question. When talking about your work experience, try to clearly cover (1) the problem, (2) your solution, (3) the outcome, and (4) any reflections on improvements. A good way to provide a well-thought-out answer is by using the STAR Interview Response Technique.
What are some qualities you’ve seen in successful and impactful engineers on your teams (both in the present and past)?
At a startup like Databricks, the most important quality I’ve seen in successful engineers is ownership. We are growing quickly, which brings a lot of new challenges every week, but it’s not always clear how responsibilities divide across teams or how priorities get determined. Great engineers handle this ambiguity by surfacing the most impactful problems to work on, not just those limited to their current team’s responsibilities. Sometimes this means directly helping to build the solution, but often it’s motivating others to prioritize the work.
The second quality we focus on, particularly for those earlier in their career, is the ability to learn and grow. The derivative of knowledge, meaning how quickly a candidate is learning, is often more important than their current technical skills. Many of the engineering problems we are solving don’t have existing templates to follow. That means continually breaking through layers of abstraction to consider the larger system - from the lowest level of CPU instructions, up to how visualizations are rendered in the browser.
How have I seen these qualities in interviews? Engineers who show a lot of ownership can often speak in detail about the adjacent systems they relied on for past work. For example, they know the strengths and weaknesses of a specific storage layer or build system they used and why. They also often create changes to help their team become more effective - either through tooling improvements or a process change. Growth comes across through reflection on past work. No solution is perfect, and great engineers know what they would do next or do differently. A lot of candidates say the opportunity to grow is their main criterion for choosing their next job, but they should be able to talk about what they are already doing to grow. Maybe that’s a side project, a new technology they recently learned, some improvement to their developer environment, or a mentor relationship they are cultivating in their current role.
What are some of the problems your team is working on? What skills are you looking for that will make candidates successful with these problems?
The Workspace team has a pretty broad set of product use cases to support, and most of the team works full stack. We look for generalists who have shown an ability to quickly learn new technologies. We are also very customer-facing and need engineers who can dig deep to understand our users and formulate requirements. Several of the team members either had their own startups in the past or worked as early employees at startups.
One of the best ways to understand a role is to ask, “What will I become a master of?” For the Workspace team it’s three main skills.
- Quickly learning new technologies. The Workspace team does a lot of exploratory and prototype work. The team has many generalists who need to combine product sense and an ability to adapt existing technology to novel problems. A good example is adapting open source Jupyter to run in the Databricks-hosted cloud with Databricks Clusters. Another one is creating pub/sub infrastructure to stream updates through a GraphQL API to real-time web clients.
- The workflows around data science, machine learning and data analytics. We are building products for those personas so you will intimately understand the daily workflow of data scientists and data engineers at a variety of customers across many industries and company sizes. Engineers on this team have regular exposure to our customers and internal customer champions in Field Engineering.
- Scalable web service design on the JVM. Our team works on the core backend for the stateful notebooks and Workspace, which often faces design challenges unique to a service at our scale. Everyone on the team develops a deep understanding of resource primitives (CPU/memory/IO/network) and how to optimize their usage in a distributed, fault-tolerant architecture.
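For readers unfamiliar with the pub/sub pattern mentioned in the first skill above, here is a minimal in-process sketch in Python. It is an illustration of the general pattern only and makes no claim about how the Workspace team’s infrastructure is built: the `PubSub` class, the topic name, and the message shape are all hypothetical, and the GraphQL and network layers are left out entirely.

```python
from collections import defaultdict


class PubSub:
    """A tiny in-process publish/subscribe hub keyed by topic name."""

    def __init__(self):
        # Maps each topic to the list of callbacks subscribed to it.
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a callback for a topic and return a function that removes it."""
        self._subscribers[topic].append(callback)

        def unsubscribe():
            if callback in self._subscribers[topic]:
                self._subscribers[topic].remove(callback)

        return unsubscribe

    def publish(self, topic, message):
        """Deliver a message to every current subscriber; return how many received it."""
        delivered = 0
        # Iterate over a copy so a callback may unsubscribe during delivery.
        for callback in list(self._subscribers[topic]):
            callback(message)
            delivered += 1
        return delivered


# Usage: a hypothetical web client listening for updates to one notebook.
bus = PubSub()
received = []
unsubscribe = bus.subscribe("notebook/42", received.append)

bus.publish("notebook/42", {"cell": 1, "status": "running"})
bus.publish("notebook/42", {"cell": 1, "status": "done"})
unsubscribe()
bus.publish("notebook/42", {"cell": 2, "status": "running"})  # no longer delivered
```

In a real service the callbacks would push messages over a network connection to each client, and the hub would need to handle slow or disconnected subscribers, which is where the resource and fault-tolerance concerns described above come in.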
At Databricks, we are constantly looking for Software Engineers who embody the characteristics we’ve talked about. If you are interested in solving some of the challenges that we are currently tackling here, check out our Careers Page and apply to interview with us!
Ted Tomlinson is a Director of Engineering at Databricks. He manages the Workspace team, which is responsible for Databricks' flagship collaborative notebooks product and the services used to enable interactive data science and machine learning across environments.