Sr Big Data Engineer
Role: Sr Big Data Engineer
Location: Detroit, MI (Fully Remote)
Duration: Long Term
Job Description:
The Senior Big Data Engineer is responsible for the design, development, and maintenance of the big data platform and its solutions, including analytical solutions that provide visibility and decision support using big data technologies.
The role also involves administering a Hadoop cluster, developing data integration solutions, and collaborating with data scientists, system administrators, and data architects to ensure the platform meets business demands.
Responsibilities:
- Develop ETL processes that pull from data repositories and APIs across the enterprise, ensuring data quality and process efficiency
- Develop data processing scripts using Spark
- Develop relational and NoSQL data models in Hive and HBase that conform data to users’ needs
- Integrate the platform with the existing enterprise data warehouse and various operational systems
- Develop administration processes that monitor cluster performance, resource usage, backups, and mirroring to ensure a highly available platform
- Address performance and scalability issues in a large-scale data lake environment
Minimum Qualifications:
- Bachelor’s degree in computer science, information technology, or a related field, or equivalent experience
- 4 years of experience with a big data/Hadoop distribution and ecosystem tools such as Hive, HBase, Spark, Kafka, NiFi, and Oozie
- 1+ year of experience with AWS ecosystem technologies, including EMR clusters, Glue jobs, and Athena
- 4 years of experience developing batch and streaming ETL processes; Spark Streaming preferred, and Kafka experience is a strong plus
- 4 years of experience working with relational and NoSQL databases, including modeling and writing complex queries
Preferred Qualifications:
- Master’s degree in computer science, information technology, or a related field, or equivalent experience
- Experience with programming languages such as Python, Java, or Scala
- Experience with Linux system administration and scripting, and basic networking skills
- Experience consuming and developing REST APIs
Top Three Skills:
- Excellent knowledge of Spark and Spark Streaming
- Proficiency in building Spark data pipelines with Python, Scala, or Java
- AWS serverless analytics tools (EMR, Glue, Athena)