We’re growing fast and attracting the best talent in the world. Bricksters — as we call ourselves — are a special mix of smart, curious, quick thinkers. If you ask a Brickster what they love about working here, you’ll likely hear about our culture.
We are seeking an experienced NOC Engineer to join our team. The successful candidate will be responsible for monitoring Databricks’ critical infrastructure and developing monitoring tools and alerting dashboards. They will also work closely with stakeholders to investigate and resolve incidents, perform root cause analysis, and propose solutions to increase the reliability and stability of the Databricks Data Intelligence Platform.
The impact you will have here:
- Monitor critical infrastructure, triage alerts to proactively identify incidents, and work with stakeholders to resolve them.
- Investigate incidents and propose solutions to improve platform reliability and stability.
- Perform root cause analysis for recurring incidents and provide proactive solutions.
- Develop tooling and automate processes to improve platform monitoring and alerting.
- Contribute to software development efforts to improve overall service reliability and stability.
- Communicate effectively with internal stakeholders, including executive staff, to provide incident analysis.
- Participate in war rooms and temporary communication channels during outages.
- Demonstrate cross-functional leadership and establish ownership of incidents and outages.
- Multitask across several incidents and/or projects at once.
What are we looking for?
- Minimum of 3 years of experience as a NOC, SRE, or DevOps engineer
- Strong knowledge of cloud technologies such as Azure, AWS, and GCP
- Hands-on experience with monitoring, logging, and alerting tools such as ELK, Prometheus, Grafana, and PagerDuty
- Hands-on experience with containers and orchestration technologies such as Docker and Kubernetes
- Strong software development skills and experience
- Proficiency in automation and scripting (Python, Bash, Terraform)
- Understanding of CI/CD principles
- Linux systems administration skills
- Incident management experience
- Excellent communication skills
- Ability to work well under pressure in a fast-paced environment
- A high degree of integrity, accountability, attention to detail, execution, and planning expertise
- Bachelor's degree in Computer Science or a related field
- Willingness to learn Databricks products
Benefits
- Benefits allowance
- Employee's Provident Fund
- Equity awards
- Gym reimbursement
- Annual personal development fund
- Work headphones reimbursement
- Business travel insurance
Databricks is the data and AI company. More than 10,000 organizations worldwide, including Comcast, Condé Nast, Grammarly, and over 50% of the Fortune 500, rely on the Databricks Data Intelligence Platform to unify and democratize data, analytics, and AI. Databricks is headquartered in San Francisco, with offices around the globe, and was founded by the original creators of Lakehouse, Apache Spark™, Delta Lake, and MLflow. To learn more, follow Databricks on Twitter, LinkedIn, and Facebook.
Our Commitment to Diversity and Inclusion
At Databricks, we are committed to fostering a diverse and inclusive culture where everyone can excel. We take great care to ensure that our hiring practices are inclusive and meet equal employment opportunity standards. Individuals looking for employment at Databricks are considered without regard to age, color, disability, ethnicity, family or marital status, gender identity or expression, language, national origin, physical and mental ability, political affiliation, race, religion, sexual orientation, socio-economic status, veteran status, and other protected characteristics.
If access to export-controlled technology or source code is required for performance of job duties, it is within Employer's discretion whether to apply for a U.S. government license for such positions, and Employer may decline to proceed with an applicant on this basis alone.