Talent.com
Spark Engineer

Qi Group, Petaling Jaya, Selangor, Malaysia
Posted 1 day ago
Job description

We are seeking a highly skilled Spark Engineer to design, build, and optimize large-scale data processing systems using Apache Spark. The ideal candidate will have deep expertise in distributed data processing, ETL pipelines, and performance tuning for high-volume data environments. You will collaborate with data scientists, analysts, and engineers to ensure scalable, reliable, and efficient data solutions.

Key Responsibilities:

  • Design, develop, and maintain big data solutions using Apache Spark (Batch and Streaming).
  • Build data pipelines for processing structured, semi-structured, and unstructured data from multiple sources.
  • Optimize Spark jobs for performance and scalability across large datasets.
  • Integrate Spark with various data storage systems (HDFS, S3, Hive, Cassandra, etc.).
  • Collaborate with data scientists and analysts to deliver robust data solutions for analytics and machine learning.
  • Implement data quality checks, monitoring, and alerting for Spark-based workflows.
  • Ensure security and compliance of data processing systems.
  • Troubleshoot and resolve data pipeline and Spark job issues in production environments.

Required Skills & Qualifications:

  • Bachelor's degree in Computer Science, Engineering, or a related field (Master's preferred).
  • 3+ years of hands-on experience with Apache Spark (Core, SQL, Streaming).
  • Strong programming skills in Scala, Java, or Python (PySpark).
  • Solid understanding of distributed computing concepts and big data ecosystems (Hadoop, YARN, HDFS).
  • Experience with data serialization formats (Parquet, ORC, Avro).
  • Familiarity with data lake and cloud environments (AWS EMR, Databricks, GCP Dataproc, or Azure Synapse).
  • Knowledge of SQL; experience with data warehouses (Snowflake, Redshift, BigQuery) is a plus.
  • Strong background in performance tuning and Spark job optimization.
  • Experience with CI/CD pipelines and version control (Git).
  • Familiarity with containerization (Docker, Kubernetes) is an advantage.

This is a 12-month contract appointment, with the potential for extension or conversion to a permanent position subject to performance and business needs.
