Preferred candidate profile
Technical Skills:
Experience with regular expressions (regex) for data parsing.
Strong knowledge of the HTTP protocol, cookies, headers, and user-agent rotation.
Familiarity with databases (SQL and NoSQL) for storing scraped data.
Hands-on experience with data manipulation libraries such as pandas and NumPy.
Bonus Skills: Knowledge of containerization tools like Docker.
Role & responsibilities
Develop and maintain automated web scraping scripts using Python libraries such as BeautifulSoup, Scrapy, and Selenium.
Optimize scraping pipelines for performance, scalability, and resource efficiency.
Handle dynamic websites and CAPTCHA challenges, and implement IP rotation techniques for uninterrupted scraping.
Process and clean raw data, ensuring accuracy and integrity in extracted datasets.
Collaborate with cross-functional teams to understand data requirements and deliver actionable insights.
Leverage APIs when web scraping is not feasible, managing authentication and request optimization.
Document processes, pipelines, and troubleshooting steps for maintainable and reusable scraping solutions.
Ensure compliance with legal and ethical web scraping practices, and implement security safeguards.

Keyskills: Beautiful Soup, Scrapy, Selenium, Web Scraping, HTTP Protocol, NoSQL, pandas, Django Framework, MongoDB, NumPy, SQL