We are seeking a highly skilled Spark Engineer to design, build, and optimize large-scale data processing systems using Apache Spark. The ideal candidate will have deep expertise in distributed data processing, ETL pipelines, and performance tuning for high-volume data environments. You will collaborate with data scientists, analysts, and engineers to ensure scalable, reliable, and efficient data solutions.
Key Responsibilities:
Design, develop, and maintain big data solutions using Apache Spark (batch and streaming).
Build data pipelines for processing structured, semi-structured, and unstructured data from multiple sources.
Optimize Spark jobs for performance and scalability across large datasets.
Integrate Spark with various data storage systems (HDFS, S3, Hive, Cassandra, etc.).
Collaborate with data scientists and analysts to deliver robust data solutions for analytics and machine learning.
Implement data quality checks, monitoring, and alerting for Spark-based workflows.
Ensure security and compliance of data processing systems.
Troubleshoot and resolve data pipeline and Spark job issues in production environments.
Required Skills & Qualifications:
Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).
3+ years of hands-on experience with Apache Spark (Core, SQL, Streaming).
Strong programming skills in Scala, Java, or Python (PySpark).
Solid understanding of distributed computing concepts and big data ecosystems (Hadoop, YARN, HDFS).
Experience with data serialization formats (Parquet, ORC, Avro).
Familiarity with data lake and cloud environments (AWS EMR, Databricks, GCP Dataproc, or Azure Synapse).
Knowledge of SQL; experience with data warehouses (Snowflake, Redshift, BigQuery) is a plus.
Strong background in performance tuning and Spark job optimization.
Experience with CI/CD pipelines and version control (Git).
Familiarity with containerization (Docker, Kubernetes) is an advantage.
This is a 12-month contract appointment, with the potential for extension or conversion to a permanent position subject to performance and business needs.