Database Systems Engineer, Cassandra Storage ASE

Apple • Full-time • San Francisco, CA, US • $130k - $180k / year • 3w ago

Summary
Imagine what you could do here. At Apple, new ideas have a way of becoming great products, services, and customer experiences very quickly. Bring passion and dedication to your job and there's no telling what you could accomplish.

Apple is seeking experienced database systems engineers to join our Cassandra Storage team. Apple's Cassandra organization develops and contributes to Apache Cassandra, an open source distributed database powering many of Apple's most critical internet services. You will be joining a team of experts, working at the cutting edge of modern database technology, distributed systems, and storage engineering to evolve Apache Cassandra. The team's work is deployed at massive scale. It also has a big impact, providing the storage platform upon which many iCloud, Media, and other internet services at Apple are built. Your work will benefit all users of Apple products and is critical to the success of current and future offerings.

Description
Apple's Cassandra Storage team develops storage systems that are correct, reliable, scalable, and fast. This work requires an innovative spirit and an extraordinary degree of care and rigor in engineering. Team members contribute to all major components of Apache Cassandra, including query coordination and execution, replication and persistence, transactions and consensus, compaction, client and internode messaging, and all other aspects of the database.

As a member of this team, you will build and evolve major components of the database. These areas include:

* Traffic and load balancing
* Security and authorization
* Quota and rate limiting
* Tenant isolation

Success in this role requires expertise in several of the following areas and a desire to gain experience in the others:

* Fundamentals of system-level hardware and networking components (storage devices and controllers, network interfaces, CPU and memory layout in server-class systems).
* Operating systems concepts (process scheduling, disk and network I/O, performance).
* Datacenter architecture (networking topologies, host placement strategies, and failure modes); design of multi-datacenter systems; failure domains; and wide-area networking.
* Understanding of distributed systems concepts (fallacies of distributed computing, CAP, FLP, etc.).
* Understanding of database concepts (consistency models, isolation levels, crash and recovery semantics).
* Advanced concepts such as failure detection, smart clients, load balancing, request pipelining, speculation / retry policies, and operational semantics of high-throughput distributed systems.
* Performance engineering (design concepts, profile-guided optimization).
* Software validation concepts (fault injection, property-based testing and model checking, workload replay, quality metrics).

This role also requires excellent communication, the ability to partner with our Site Reliability peers, and a high degree of customer focus when engaging with internal platform customers. The ability to work effectively with colleagues based in other locations is also essential; experience in this area is a plus. Prior experience with development of distributed databases / storage systems is recommended.
