Key responsibilities include:
- Build and maintain big data technologies, environments, and applications, identifying opportunities for improvement and efficiency
- Perform ETL (Extract, Transform, Load) processes based on business requirements, using Apache Spark with data ingested from Apache Kafka (a minimal pipeline sketch follows this list)
- Work with various data platforms, including Apache Hadoop, Apache Ozone, AWS S3, Delta Lake, and Apache Iceberg
- Use orchestration tools such as Apache NiFi to manage and schedule data flows efficiently
- Write performant SQL statements to analyze data in Hive, Impala, and Oracle
- Own the full application development lifecycle (SDLC), from design to deployment
- Work with multiple stakeholders across teams on ad-hoc investigations, including large-scale data extraction, transformation, and analysis
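In practice, the ETL work described above might look like the following minimal PySpark sketch: it ingests JSON events from a Kafka topic and appends them to a Delta Lake table on S3. The broker addresses, topic name, schema, and paths are hypothetical placeholders, and the sketch assumes the spark-sql-kafka and delta-spark packages are available on the cluster.

```python
# Minimal sketch only -- brokers, topic, schema, and paths are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, from_json, to_date
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.appName("kafka-to-delta-etl").getOrCreate()

# Assumed schema for the incoming JSON payload.
event_schema = StructType([
    StructField("event_id", StringType()),
    StructField("event_type", StringType()),
    StructField("event_ts", TimestampType()),
])

# Extract: read a stream of raw records from a Kafka topic.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker1:9092")  # placeholder brokers
       .option("subscribe", "events")                       # placeholder topic
       .load())

# Transform: parse the JSON value and derive a date column for partitioning.
events = (raw.select(from_json(col("value").cast("string"), event_schema).alias("e"))
          .select("e.*")
          .withColumn("event_date", to_date(col("event_ts"))))

# Load: append to a Delta Lake table, partitioned by date.
query = (events.writeStream
         .format("delta")
         .option("checkpointLocation", "s3a://my-bucket/checkpoints/events")  # placeholder path
         .partitionBy("event_date")
         .start("s3a://my-bucket/delta/events"))                              # placeholder path

query.awaitTermination()
```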
About You:
- Excellent problem-solving skills and a good understanding of data engineering concepts
- Proficient in Apache Spark with Python and related technologies
- Strong knowledge of SQL and query performance tuning (see the example after this list)
- Experience with Big Data technologies such as Hadoop and Oracle Exadata
- Solid knowledge of Linux environments and proficiency with bash scripting
- Effective verbal and written communication skills
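As a small illustration of the SQL performance-tuning point, the sketch below (table, column, and partition names are hypothetical) runs a query through Spark SQL that selects only the columns it needs and filters on the partition column, so engines such as Hive, Impala, or Spark can prune partitions rather than scan the full table.

```python
# Minimal sketch only -- table, column, and partition names are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("sql-tuning-example").getOrCreate()

# Select only the needed columns and filter on the partition column (event_date)
# so the engine can prune partitions instead of scanning the whole table.
daily_counts = spark.sql("""
    SELECT event_type, COUNT(*) AS event_count
    FROM analytics.events              -- hypothetical partitioned table
    WHERE event_date = DATE '2024-01-01'
    GROUP BY event_type
""")

daily_counts.explain()  # inspect the physical plan to confirm partition filters are applied
daily_counts.show()
```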
Knowledge of or prior experience with any of the following is nice to have:
- Apache Kafka, Apache Spark with Scala
- Orchestration with Apache NiFi or Apache Airflow (a minimal Airflow DAG sketch follows this list)
- Java development and microservices architecture
- Build tools like Jenkins
- Log analysis and monitoring using Splunk
- Databricks, AWS
- Working with large data sets (terabytes of data)
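For the Airflow orchestration item above, here is a minimal DAG sketch that schedules a daily Spark job. The DAG id, schedule, and spark-submit command are hypothetical placeholders, not a description of this role's actual pipelines.

```python
# Minimal sketch only -- DAG id, schedule, and command are hypothetical placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="daily_events_etl",
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:

    # Submit the Spark ETL job once per day; the spark-submit path is illustrative.
    run_spark_etl = BashOperator(
        task_id="run_spark_etl",
        bash_command="spark-submit /opt/jobs/kafka_to_delta_etl.py",
    )
```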
Must Have Skills:
- Hadoop
- Oracle
- NiFi
Nice to Have Skills:
- Power BI
- Scala
- Machine Learning
Job Types: Full-time, Contract
Pay: $39.00 - $40.00 per hour
Expected hours: 8 per week
Benefits:
- 401(k)
- Dental insurance
- Health insurance
Work Location: In person