We are seeking a skilled Data Engineer to join our team and help build and maintain robust data pipelines, optimize data processing workflows, and ensure the availability and reliability of data infrastructure. You will work closely with data scientists, analysts, and software engineers to enable seamless data-driven decision-making.
Design, develop, and maintain scalable ETL (Extract, Transform, Load) pipelines (see the illustrative sketch after this list of responsibilities).
Build and optimize data warehouses, lakes, and pipelines for analytics and machine learning applications.
Collaborate with data scientists and analysts to ensure efficient data access and modeling.
Implement data quality checks, monitoring, and governance processes.
Optimize database queries and storage solutions for performance and cost-effectiveness.
Work with big data technologies such as Spark, Hadoop, or Kafka.
Develop and maintain API integrations for data ingestion and sharing.
Ensure security and compliance best practices are followed in data management.
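To give candidates a concrete flavor of the pipeline work described above, here is a minimal, illustrative sketch of a daily ETL job. It assumes Apache Airflow's TaskFlow API; the DAG name, record fields, and load target are hypothetical placeholders, not a description of our actual stack.

```python
# Illustrative sketch only -- assumes Apache Airflow's TaskFlow API;
# the DAG name, record fields, and load target are hypothetical.
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def orders_etl():
    @task
    def extract() -> list[dict]:
        # Pull raw records from a source system (stubbed here).
        return [{"order_id": 1, "amount": "19.99", "status": "completed"}]

    @task
    def transform(rows: list[dict]) -> list[dict]:
        # Cast types and apply a simple data-quality check before loading.
        cleaned = [{**row, "amount": float(row["amount"])} for row in rows]
        if any(row["amount"] < 0 for row in cleaned):
            raise ValueError("negative amount found in orders batch")
        return cleaned

    @task
    def load(rows: list[dict]) -> None:
        # In practice this would write to a warehouse (Redshift, Snowflake,
        # BigQuery) via a provider hook; stubbed as a log line here.
        print(f"loading {len(rows)} cleaned rows")

    load(transform(extract()))


orders_etl()
```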
Bachelor's or Master's degree in Computer Science, Data Engineering, or a related field.
2+ years of experience in Data Engineering or a related field (varies by level).
Proficiency in SQL and experience with both relational and NoSQL databases (PostgreSQL, MySQL, MongoDB, etc.).
Hands-on experience with cloud platforms (AWS, Azure, GCP) and data warehouse solutions (Redshift, Snowflake, BigQuery).
Strong programming skills in Python, Java, or Scala for data processing.
Familiarity with pipeline orchestration and transformation tools such as Apache Airflow, Talend, or dbt.
Experience with big data frameworks such as Spark or Hadoop, or streaming platforms such as Kafka, is a plus (see the Spark-style sketch after these requirements).
Understanding of CI/CD practices for data pipeline automation.
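As an illustration of the Spark-style batch processing mentioned in the requirements, the following sketch rolls raw order data up into daily revenue. It assumes PySpark; the bucket paths and column names are hypothetical.

```python
# Illustrative sketch only -- assumes PySpark; paths and columns are hypothetical.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("daily_revenue_rollup").getOrCreate()

# Read raw order events from object storage.
orders = spark.read.parquet("s3://example-bucket/raw/orders/")

# Keep completed orders and roll revenue up to one row per day.
daily_revenue = (
    orders
    .where(F.col("status") == "completed")
    .groupBy(F.to_date("created_at").alias("order_date"))
    .agg(F.sum("amount").alias("revenue"))
)

# Partitioning the output keeps downstream analytical scans cheap.
(
    daily_revenue.write
    .mode("overwrite")
    .partitionBy("order_date")
    .parquet("s3://example-bucket/marts/daily_revenue/")
)
```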
Experience with streaming data technologies (Apache Flink, Kinesis, etc.); an illustrative consumer sketch follows this list.
Knowledge of machine learning workflows and MLOps.
Exposure to containerization and orchestration (Docker, Kubernetes).
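For the streaming side, here is a brief sketch of the kind of consumer this role might build. It assumes the kafka-python client; the topic name, broker address, and validation fields are hypothetical.

```python
# Illustrative sketch only -- assumes the kafka-python client; the topic,
# broker, and event fields are hypothetical placeholders.
import json

from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders.events",                        # hypothetical topic
    bootstrap_servers="localhost:9092",     # hypothetical broker
    group_id="orders-sink",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

for message in consumer:
    event = message.value
    # Basic validation before handing the event to a downstream sink
    # (e.g. a micro-batch loader into the warehouse).
    if "order_id" in event and "amount" in event:
        print(f"ingesting order {event['order_id']}")
```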