Senior Python Developer @ Shaip Ai Data (india)

 Senior Python Developer

Job Description

Role Overview

We are looking for an experienced Python Developer with expertise in large-scale dataset validation and automation pipelines.

The role involves designing scalable and production-grade scripts to process and validate image, video, and audio datasets, detect anomalies (UUID mismatch, metadata issues, frequency checks, corruption), and automate reporting and error handling workflows.

The ideal candidate will have strong software engineering skills, experience with cloud platforms (AWS/GCP/Azure), and building robust data validation pipelines for high-volume media datasets.

Key Responsibilities

  • Build advanced validation frameworks for image, video, and audio datasets (metadata checks, file validation, resolution and format checks, corruption detection, UUID validation).
  • Develop automated pipelines for validation, error reporting, and summary dashboards.
  • Integrate cloud storage (AWS S3/GCP/Azure Blob) for direct dataset processing.
  • Implement parallel processing and multiprocessing to handle millions of files efficiently.
  • Write modular, reusable, and production-ready code with proper logging and exception handling.
  • Build command-line tools or APIs for internal teams to trigger validation workflows.
  • Collaborate with QA, Data Engineering, and ML teams to define dataset quality standards.
  • Maintain CI/CD workflows for automated script deployment and versioning.

Required Technical Skills

  • Advanced Python programming with experience in building scalable, production-ready scripts.
  • Expertise with libraries such as OpenCV (opencv-python), Pillow, PyDub, librosa, mutagen, pandas, and boto3/google-cloud-storage.
  • Strong understanding of multithreading, multiprocessing, and performance optimization.
  • Experience in hashing (MD5/SHA) and UUID generation/validation for file integrity checks.
  • Hands-on experience with cloud platforms (AWS/GCP/Azure) for file I/O operations at scale.
  • Knowledge of data pipeline orchestration tools (Airflow, Prefect, etc.) is a plus.
  • Familiarity with Docker, CI/CD tools (GitHub Actions, Jenkins, GitLab CI) for script deployment.
  • Ability to design robust error reporting systems (logs, JSON/CSV reports, dashboards).

Good to Have

  • Experience with media processing frameworks (FFmpeg, GStreamer).
  • Knowledge of database integration (PostgreSQL, MongoDB, Elasticsearch) for metadata storage.
  • Exposure to ML dataset curation workflows.
  • Understanding of API development (FastAPI/Flask) to create validation endpoints.

Qualifications

  • Bachelor's/Master's degree in Computer Science, Software Engineering, or a related field.
  • 3-5 years of experience in Python-based automation and data validation at scale.
  • Prior experience in AI/ML data projects or large media dataset handling is highly preferred.

Key Deliverables

  • Develop scalable Python scripts to validate datasets across formats (image, video, audio).
  • Create automated error reports (CSV/JSON/PDF) and dashboards for stakeholders.
  • Build pipeline automation tools/APIs for dataset validation.
  • Optimize scripts for parallel execution on large datasets.
  • Ensure seamless integration with cloud storage & CI/CD workflows.

Job Classification

Industry: IT Services & Consulting
Functional Area / Department: Customer Success, Service & Operations
Role Category: Operations
Role: Operations - Other
Employment Type: Full time

Contact Details:

Company: Shaip Ai Data (india)
Location(s): Ahmedabad

Keyskills: GitHub, Dashboards, Python, AWS, SQL

₹ Not Disclosed

Shaip.AI Data India LLP