Job Summary:
We are looking for a Big Data Hadoop Developer to design, develop, and maintain large-scale data processing solutions. The ideal candidate has strong hands-on experience with the Hadoop ecosystem and with integrating relational databases such as MariaDB or Oracle DB for analytics and reporting.
Key Responsibilities:
- Design, develop, and optimize Hadoop-based big data solutions for batch and real-time data processing.
- Use data ingestion frameworks (Sqoop, Apache NiFi, Kafka) to integrate data from MariaDB / Oracle DB into Hadoop (a sketch follows this list).
- Implement Hive, Spark, and MapReduce jobs for data transformation and analytics.
- Optimize Hive queries, Spark jobs, and HDFS usage for performance and cost efficiency.
- Create and maintain ETL pipelines for structured and unstructured data.
- Troubleshoot and resolve issues in Hadoop jobs and database connectivity.
- Collaborate with BI, analytics, and data science teams for data provisioning.
- Ensure data security, governance, and compliance in all solutions.
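To make the ingestion responsibility above concrete, here is a minimal PySpark sketch that pulls a table from MariaDB over JDBC and lands it as a partitioned Hive table. The connection details, table names, and column names (db-host, sales, orders, order_date) are hypothetical placeholders, Sqoop or Apache NiFi would be equally valid tools for this step, and the sketch assumes the MariaDB JDBC driver jar is on the Spark classpath.

```python
from pyspark.sql import SparkSession

# enableHiveSupport lets saveAsTable create a Hive-managed table on HDFS.
spark = (
    SparkSession.builder
    .appName("mariadb-to-hive-ingest")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical connection details; a real job would pull these from a
# secured config or credential store rather than hard-coding them.
orders = (
    spark.read.format("jdbc")
    .option("url", "jdbc:mariadb://db-host:3306/sales")
    .option("driver", "org.mariadb.jdbc.Driver")
    .option("dbtable", "orders")
    .option("user", "etl_user")
    .option("password", "etl_password")
    .option("partitionColumn", "order_id")  # numeric column to split reads on
    .option("lowerBound", 1)
    .option("upperBound", 1000000)
    .option("numPartitions", 8)             # 8 parallel JDBC reads
    .load()
)

# Land the snapshot in Hive, partitioned by date for downstream pruning.
(
    orders.write
    .mode("overwrite")
    .partitionBy("order_date")
    .format("parquet")
    .saveAsTable("staging.orders_snapshot")
)
```

In practice such a job would be packaged and launched with spark-submit, with the driver jar supplied through --jars and credentials injected from a secrets store.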
Technical Skills:
- Big Data Ecosystem: Hadoop (HDFS, YARN), Hive, Spark, Sqoop, MapReduce, Oozie, Flume.
- Databases: MariaDB and/or Oracle DB (SQL, PL/SQL).
- Programming: Java, Scala, or Python for Spark / MapReduce development.
- Data Ingestion: Sqoop, Kafka, Apache NiFi (for integrating RDBMS with Hadoop).
- Query Optimization: Hive tuning, partitioning, bucketing, indexing (see the sketch after this list).
- Tools: Ambari, Cloudera Manager, Git, Jenkins.
- OS & Scripting: Linux / Unix shell scripting.
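As a concrete illustration of the partitioning and bucketing expectation, the sketch below rewrites a hypothetical staging.raw_events table so that date filters prune whole partitions and joins on user_id can avoid a shuffle. Note that Spark's bucketBy produces Spark-style bucketed tables; native Hive bucketing would instead use CLUSTERED BY DDL in HiveQL.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("hive-layout-tuning")
    .enableHiveSupport()
    .getOrCreate()
)

# Hypothetical source table of raw click events.
events = spark.table("staging.raw_events")

# Partition by date so WHERE event_date = '...' scans one directory;
# bucket and sort by user_id so bucketed joins can skip the shuffle.
(
    events.write
    .partitionBy("event_date")
    .bucketBy(16, "user_id")
    .sortBy("user_id")
    .format("parquet")
    .mode("overwrite")
    .saveAsTable("analytics.events_bucketed")
)

# Only the 2024-01-01 partition's files are read for this query.
spark.sql(
    "SELECT user_id, COUNT(*) AS clicks "
    "FROM analytics.events_bucketed "
    "WHERE event_date = '2024-01-01' "
    "GROUP BY user_id"
).show()
```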
Soft Skills:
- Strong analytical and problem-solving skills.
- Good communication skills for working with cross-functional teams.
- Ability to manage priorities in a fast-paced environment.
Nice to Have:
- Experience with cloud-based big data platforms (AWS EMR, Azure HDInsight, GCP Dataproc).
- Knowledge of NoSQL databases (HBase, Cassandra).
- Exposure to machine learning integration with Hadoop / Spark.