Job posting has expired
Video Site Reliability Engineer (SRE) - Apple Media Products
United States, San Diego
Summary
Posted: Apr 3, 2023
Role Number: 200473038
Imagine what you could do here. At Apple, new ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish!
Apple Media Products' SRE team is looking for a world-class Site Reliability Engineer with experience in developing processes, tools, and automation for managing distributed systems in production environments. Our SRE team combines software engineering, systems engineering, and system administration practices to build and run large-scale, massively distributed, fault-tolerant systems. Our software ensures that Apple's services are reliable, scalable, and secure, and we leverage both open source and home-grown technologies to provide managed data infrastructure services. We balance our time across automating operations for our growing footprint of deployments, building self-service products to empower internal customers, and increasing the reliability and scalability of our services with application- and systems-level improvements.
Dynamic, smart people and inspiring, innovative technologies are the norm here. Will you join us in crafting solutions that do not yet exist?
Key Qualifications
Description
The Video SRE team seeks a driven, curious engineer who is excited to contribute to our mission of making the Apple TV app the best place to watch all your favorite movies, shows, and sports. Our team is passionate about delivering the highest-quality resilient infrastructure to power our services. The ideal candidate brings a driven approach to continually improving service levels, an ability to understand large, complex systems, and a passion for constantly improving environments and processes to better serve our customers.
You should have experience building and delivering self-healing platforms and services via automation. This role engages with engineering teams to improve service resilience from design to deployment to operation by building in DR, security, performance, and reliability. Candidates must demonstrate a consistent track record of troubleshooting and resolving issues in live production environments and of implementing strategies to prevent them from recurring.
Proficient coding experience in Python, Java, Bash, or similar languages is required. You should be able to use a relational database and SQL queries effectively, and you should have a strong grasp of Linux systems, networking, and security, along with experience with monitoring tools such as Splunk, Prometheus, and Grafana. Qualified candidates are able to work with large-scale (virtual and on-prem) deployments and have a demonstrated ability to deliver results on time with high quality and attention to detail. Knowledge of additional programming languages and platforms is useful: Go, Kafka, Cassandra, Solr, Redis, databases.
We seek a self-starter who has a strong ability to innovate and who values building trust through quality work. You will interact with many other internal teams to lead and deliver best-in-class products in an exciting, fast-paced environment.
Education & Experience
BS in engineering, computer science, or another technical discipline (or equivalent experience), plus 2-5+ years of related experience
Additional Requirements
Pay & Benefits