Job Title: Big Data/Hadoop Engineer
Location: Plano, TX
Required:
– 1-2 years of experience creating, maintaining, and managing Hadoop clusters
– 3-5 years of development experience centered around big data applications and ad hoc transformation of unstructured raw data
– 1-2 years of relational DBA experience, preferably with SQL Server and/or MySQL
– Experience designing, building, and maintaining Big Data workflows/pipelines that process continuous streams of data, including end-to-end design and build of near-real-time and batch data pipelines
– Demonstrated work experience with Big Data and distributed programming models and technologies, including:
– Knowledge of database structures, theories, principles and practices (both SQL and NoSQL).
– Active development of ETL processes and data pipelines using Spark or other highly parallel technologies
– Experience with data technologies and Big Data tools such as Spark, Kafka, and Hive
– Understanding of MapReduce and other data query, processing, and aggregation models
– Understanding of the challenges of transforming data across a distributed, clustered environment
– Experience with techniques for consuming, holding, and aging out continuous data streams
– Ability to provide quick ingestion tools and corresponding access APIs for continuously changing data schemas, working closely with Data Engineers on specific transformation and access needs
Preferred:
– Experience as a database administrator (DBA) responsible for keeping critical tools databases up and running
– Experience building and managing high-availability environments for databases and HDFS systems
– Familiarity with transaction recovery techniques and database backup
This is a hybrid Hadoop Engineer and Hadoop Infrastructure Administrator role: you will build and maintain a scalable, resilient Big Data framework to support our Data Scientists. As an administrator, you will deploy and maintain Hadoop clusters, add and remove nodes using cluster management and monitoring tools such as Cloudera Manager, and support performance and scalability requirements. Some relational database administrator experience is also desirable to support general administration of relational databases.
Skills and Attributes:
1. Ability to build effective working relationships with all functional units of the organization
2. Excellent written, verbal, and presentation skills
3. Excellent interpersonal skills
4. Ability to work as part of a cross-cultural team
5. Self-starter and self-motivated
6. Ability to work with minimal supervision
7. Works under pressure and is able to manage competing priorities.
Technical qualifications and experience level:
1. At least 5 years of combined, proven working experience as a Spark/Big Data developer, DBA, and Hadoop administrator
2. 5-10 years in development using Java, Python, Scala, and object-oriented approaches in designing, coding, testing, and debugging programs
3. Ability to create simple scripts and tools.
4. Development of cloud-based, distributed applications
5. Understanding of clustering and cloud orchestration tools
6. Working knowledge of database standards and end user applications
7. Working knowledge of data backup, recovery, security, integrity and SQL
8. Familiarity with database design, documentation and coding
9. Previous experience with DBA CASE tools (frontend/backend) and third-party tools
10. Understanding of distributed file systems and their optimal use in the commercial cloud (HDFS, S3, Google File System, Delta Lake)
11. Familiarity with programming language APIs
12. Problem-solving skills and the ability to think algorithmically
13. Working knowledge of RDBMS/ORDBMS such as MariaDB, Oracle, and PostgreSQL
14. Working knowledge of Hadoop administration
15. Knowledge of SDLC methodologies (Waterfall, Agile, and Scrum)
16. BS degree in a computer discipline is a MUST