Key Job Responsibilities:
- Perform data extraction, cleaning, transformation, and data flow management; web scraping may also form part of the extraction work
- Design, build, launch and maintain efficient and reliable large-scale batch and real-time data pipelines with data processing frameworks
- Integrate and collate data silos in a way that is both scalable and compliant
- Collaborate with Project Managers, Frontend Developers, UX Designers, and Data Analysts to build scalable data-driven products
- Develop backend APIs and work with databases to support the applications
- Work in an Agile Environment that practices Continuous Integration and Delivery
Key Skills/Qualifications:
- Proficient in general data cleaning and transformation (e.g. pandas, R)
- Proficient in building ETL pipelines (e.g. Spring)
- Proficient in database design and various databases (e.g. MongoDB, PostgreSQL/PostGIS, MySQL, SQLite, VoltDB, Cassandra)
- Familiar with REST APIs and web requests/protocols in general
- Familiar with big data frameworks and tools (e.g. Hadoop, Spark, Kafka, RabbitMQ)
- Familiar with the W3C Document Object Model and customized web scraping (e.g. BeautifulSoup, CasperJS, PhantomJS, Selenium, Node.js)
- Comfortable in at least one scripting language (e.g. Python)
- Comfortable in both Windows and Linux development environments
- Experience with, and a passion for, data engineering in big data environments on cloud platforms such as AWS, Azure, and Google Cloud Platform (GCP)
- Experience building production-grade data pipelines and ETL/ELT data integration
- Interest in being the bridge between engineering and analytics
- Knowledge about system design, data structure and algorithms
- Familiar with data modelling, data access, and data storage infrastructure such as data marts, data lakes, and data warehouses
Location: Singapore
Job Type: Permanent, Full-Time