Key Responsibilities
Data Ecosystem Development:
- Design and develop a robust, high-performance data ecosystem.
- Build and launch reliable data pipelines to extract, transform, and load both large and small datasets (a minimal ETL and data-quality sketch follows this list).
- Optimize existing pipelines and ensure smooth operation of domain-related data flows.
- Implement data quality checks to maintain high data integrity.
- Leverage DevOps practices to develop automated CI/CD processes that expedite testing, deployment, and promotion of analytics models (a test sketch also follows this list).
- Develop, test, deploy, and maintain efficient streaming and batch data ingestion pipelines.
- Document and enforce architecture and coding standards for all supported platforms.
- Participate in all stages of the software development lifecycle—from requirements gathering to design, development, testing, and support.
- Work collaboratively with the Analytics team and cross-functional partners to deliver innovative solutions.
- Proactively identify technology risks and seek guidance as needed.
- Embrace agile methodologies and adhere to established quality and management procedures.
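To make the pipeline and data-quality bullets above concrete, here is a minimal batch ETL sketch in PySpark. It is illustrative only: the source path, column names, and the curated.orders target table are hypothetical, and a Databricks-style environment with Delta Lake is assumed.

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("orders-etl").getOrCreate()

    # Extract: read a raw batch source (path and format are hypothetical).
    raw = spark.read.format("parquet").load("/mnt/raw/orders")

    # Transform: normalize types, derive a business column, deduplicate on the key.
    curated = (
        raw
        .withColumn("order_ts", F.to_timestamp("order_ts"))
        .withColumn("net_amount", F.col("gross_amount") - F.col("discount"))
        .dropDuplicates(["order_id"])
    )

    # Data quality checks: fail fast on integrity violations before loading.
    null_keys = curated.filter(F.col("order_id").isNull()).count()
    if null_keys > 0:
        raise ValueError(f"{null_keys} rows have a null order_id")
    negatives = curated.filter(F.col("net_amount") < 0).count()
    if negatives > 0:
        raise ValueError(f"{negatives} rows have a negative net_amount")

    # Load: overwrite the curated target table (Delta format is assumed).
    curated.write.format("delta").mode("overwrite").saveAsTable("curated.orders")

Failing the job before the load step keeps bad records out of downstream tables, which is the intent of the data-integrity bullet.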
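For the CI/CD bullet, pipeline logic is typically factored into pure functions so that a test runner such as pytest can gate each deployment automatically. A minimal sketch, with a hypothetical net_amount transformation:

    # test_transforms.py - executed by the CI pipeline via pytest (names are hypothetical).
    def net_amount(gross: float, discount: float) -> float:
        """Derive the net amount charged for an order."""
        return gross - discount

    def test_net_amount_is_gross_minus_discount():
        assert net_amount(100.0, 15.0) == 85.0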
Basic Qualifications
Education:
- Bachelor’s or Master’s degree in Computer Science or Computer Engineering.
- A Databricks or data engineering certification is preferred.
Technical Experience:
- 4+ years of development experience with Python, PySpark, or another modern programming language.
- Advanced knowledge of data modeling techniques such as database normalization and entity-relationship (ER) modeling.
- Hands-on expertise in designing and managing data pipelines across diverse sources and targets (e.g., Databricks, SQL Server, data lakes, distributed systems, and SQL and NoSQL databases).
- Proven experience implementing and maintaining batch and real-time ETL pipelines (a streaming sketch follows this list).
- Experience with common data engineering tools (e.g., Git, JIRA, Confluence).
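As an illustration of the real-time half of the ETL experience above, here is a minimal PySpark Structured Streaming sketch. The Kafka broker address, topic name, event schema, checkpoint path, and bronze.orders target are all assumptions, not a prescribed setup.

    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.types import (StructType, StructField, StringType,
                                   DoubleType, TimestampType)

    spark = SparkSession.builder.appName("orders-stream").getOrCreate()

    # Schema of the incoming JSON events (field names are hypothetical).
    schema = StructType([
        StructField("order_id", StringType()),
        StructField("amount", DoubleType()),
        StructField("event_ts", TimestampType()),
    ])

    # Read from a Kafka topic; broker and topic are assumptions.
    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", "broker:9092")
        .option("subscribe", "orders")
        .load()
        .select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
        .select("e.*")
    )

    # Append incrementally to a Delta table; the checkpoint gives restart safety.
    query = (
        events.writeStream.format("delta")
        .option("checkpointLocation", "/mnt/checkpoints/orders")
        .outputMode("append")
        .toTable("bronze.orders")
    )
    query.awaitTermination()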
If you’re ready to drive innovation and deliver impactful analytics solutions at NOV, we’d love to hear from you.