Job Summary:
We are seeking a highly skilled Big Data Engineer with 8+ years of experience in data migration, data setup, and data systems development. The ideal candidate will have deep expertise in Apache Spark, SQL, and Java (with Scala) for large-scale data processing, reporting, and system development. Strong knowledge of data architecture and semantic layer development, along with hands-on experience in regression testing and cutover activities for enterprise-level migrations, is essential.
Key Responsibilities:
Spark:
- Design, develop, and optimize Spark-based ETL pipelines for large-scale data processing and analytics.
- Utilize Spark SQL, DataFrames, RDDs, and Streaming for efficient data transformations.
- Tune Spark jobs for performance, including memory management, partitioning, and execution plans.
- Implement real-time and batch data processing using Spark Streaming or Structured Streaming.
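The filter/transform/aggregate shape of the Spark DataFrame pipelines described above can be sketched with plain JDK streams (an illustrative stand-in only; the record and field names here are hypothetical, not from the posting):

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class EtlSketch {
    // Hypothetical input row; field names are illustrative only.
    record Event(String region, double amount, boolean valid) {}

    // Drop invalid rows, then sum amounts per region -- the same
    // filter / groupBy / aggregate shape a Spark DataFrame job uses.
    static Map<String, Double> totalsByRegion(List<Event> events) {
        return events.stream()
                .filter(Event::valid)
                .collect(Collectors.groupingBy(
                        Event::region,
                        TreeMap::new,                       // deterministic key order
                        Collectors.summingDouble(Event::amount)));
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
                new Event("EU", 10.0, true),
                new Event("EU", 5.0, false),   // removed by the filter step
                new Event("US", 7.5, true));
        System.out.println(totalsByRegion(events)); // {EU=10.0, US=7.5}
    }
}
```

In a real Spark job the same logic would run distributed via `filter` and `groupBy().sum()` on a DataFrame, with partitioning and memory tuning applied as the responsibilities above describe.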
SQL:
- Write and optimize complex SQL queries for data extraction, transformation, and aggregation.
- Perform query performance tuning, indexing, and partitioning for efficient execution.
- Develop stored procedures, functions, and views to support data operations.
- Ensure data consistency, integrity, and security across relational databases.
Java (Preferred with Scala Knowledge):
- Develop backend services and data processing applications using Java and Scala.
- Optimize JVM performance, including memory management and garbage collection, for Spark workloads.
- Leverage Scala's functional programming capabilities for efficient data transformations.
- Implement multithreading, concurrency, and parallel processing in Java for high-performance systems.
Required Skills & Qualifications:
- 8+ years of experience in data engineering, with a focus on big data technologies.
- Strong proficiency in Apache Spark, SQL, and Java/Scala.
- Experience in data migration, data setup, and semantic layer development.
- Solid understanding of data architecture, ETL frameworks, and data governance.
- Hands-on experience with regression testing and cutover planning in large-scale data migrations.
- Familiarity with cloud platforms (e.g., AWS, Azure, GCP) is a plus.
- Excellent problem-solving and analytical skills.
- Strong communication and collaboration abilities.
Preferred Qualifications:
- Experience with Hadoop ecosystem tools (Hive, HDFS, Oozie, etc.).
- Knowledge of containerization and orchestration (Docker, Kubernetes).
- Exposure to CI/CD pipelines and DevOps practices.
- Relevant certifications in Big Data or Cloud technologies.
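The multithreading and parallel-processing skills listed above can be illustrated with a minimal JDK-only sketch (names and workload are hypothetical, not from the posting): partition a workload across a fixed thread pool and combine the partial results.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.IntStream;

public class ParallelSum {
    // Split the input into chunks, sum each chunk on its own thread,
    // then combine the partial sums -- the basic fork/combine pattern.
    static long parallelSum(int[] data, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            int chunk = (data.length + threads - 1) / threads; // ceiling division
            List<Future<Long>> parts = IntStream.range(0, threads)
                    .mapToObj(i -> pool.submit(() -> {
                        long sum = 0;
                        for (int j = i * chunk; j < Math.min(data.length, (i + 1) * chunk); j++)
                            sum += data[j];
                        return sum;
                    }))
                    .toList();
            long total = 0;
            for (Future<Long> part : parts)
                total += part.get(); // blocks until that chunk finishes
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] data = IntStream.rangeClosed(1, 1000).toArray();
        System.out.println(parallelSum(data, 4)); // 500500
    }
}
```

Chunking plus an `ExecutorService` keeps the example dependency-free; in production the same decomposition would typically be delegated to Spark executors or the fork/join framework.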