Job Summary:
We are seeking a skilled Web Scraping Developer to design and develop scalable, efficient web scraping applications that extract data from a variety of websites. The ideal candidate is proficient in PHP or Python, web scraping frameworks (Scrapy, Beautiful Soup), MySQL, HTML/CSS, and JavaScript. This role requires strong analytical thinking, problem-solving abilities, and a proactive approach to tackling challenges.
Key Responsibilities:
1. Requirements Analysis:
Understand the requirements from the Project Manager via the Project Management Tool.
Analyze tasks and discuss them with the team.
Record all details in the Project Management Tool, including the estimated time to completion.
2. Data Extraction:
Research and identify websites, APIs, and data sources for extraction.
Analyze website structures (HTML, CSS, JavaScript) to determine the best extraction approach.
Develop scripts using PHP, Python, or JavaScript to extract data.
Implement techniques to bypass anti-scraping measures (e.g., CAPTCHAs, rate limiting, IP blocking).
Store extracted data in structured formats (JSON, CSV).
3. Data Processing & Storage:
Clean and preprocess extracted data to ensure accuracy and completeness.
Transform data into required formats (JSON, CSV).
Handle missing data using imputation techniques or data cleaning methods.
Design a database schema to store extracted data securely.
Implement data validation and error-handling mechanisms.
4. Troubleshooting:
Identify and resolve issues related to web scraping applications.
Ensure data extraction remains functional despite website changes.
Key Skills: Python, PHP, HTML, CSS, JavaScript, web scraping, project managers
Company Name: Angel and Genie
Company Profile: Angel and Genie is an India-based recruitment firm focused on providing Recruitment, Staffing, and RPO services to clients across the globe. Our recruiters and headhunters are based in locations across India. Our Clientele include...