Senior Data Engineer

Salary not specified

Vacancy archived

The employer has probably already found a suitable candidate and is no longer accepting applications for this vacancy

Required work experience: 3–6 years

Full-time, flexible schedule

About us

Profitero provides a world-leading eCommerce SaaS analytics solution that helps our clients, world-class brands, make valuable decisions on how to improve their business.

We are inviting a senior data engineer, or a senior software developer eager to work with data processing, to join our R&D team and help us maintain and implement various analytics, machine learning and data manipulation modules.

To support the company's product, the R&D team works in the following directions:

  • Product sales estimation for online retailers such as Amazon, Tmall, etc.

  • Causal Analytics to explain sales changes and suggest actions to boost them

  • Similar product detection

  • Product classification into clients' custom categories

  • Product meta-information recognition: brands, packaging, colors, etc.

  • Aspect-based sentiment analysis of product reviews

The Data Engineer will work with the ETL pipelines serving the directions above and develop custom software components and analytics applications.

How much data we have:

  • About 22 TB in a ClickHouse cluster

  • About 4 TB in MySQL

  • About 10 TB in a Cassandra cluster

  • Planning to have up to 500 TB of raw data in a Hadoop cluster

What data we have:

  • Daily updates of 35M products on 9 Amazon websites

  • Additionally, daily updates of 12M products on 6k other websites

  • 1.5 TB of various events registered across the entire system over half a year, and it's only the beginning

Responsibilities:

  • Design, construct, install, test and maintain highly scalable data management systems

  • Ensure systems meet business requirements and industry practices

  • Build high-performance algorithms, prototypes, predictive models and proofs of concept supporting Data Scientists

  • Research opportunities for data acquisition and new uses for existing data

  • Create custom software components and analytics applications

  • Employ a variety of languages and tools (e.g. scripting languages) to marry systems together; we work with Kotlin, Java, Python, Ruby and Bash

  • Be passionate about data quality, develop a monitoring system, and recommend ways to improve data reliability, efficiency and quality

  • Install and update disaster recovery procedures

  • Plan and organize storage volume increases, data replication and migration

  • Collaborate with data scientists, system architects and IT team members on project goals

Skills & Requirements:

- 3+ years of experience with a major object-oriented language: Java, C++, C#, Python, PHP, etc. Ability to learn and use any language
- Solid understanding of algorithms, data structures and their usage
- Advanced knowledge of multithreading and multiprocessing programming
- Solid understanding of performance optimization
- Good knowledge of relational databases (e.g. MySQL)
- Experience with Linux, Docker, Git and Jenkins
- Experience with the Hadoop stack (HDFS, Spark, MapReduce, HBase, Hive, etc.)
- Experience with messaging systems (e.g. RabbitMQ, Apache Kafka)
- Intermediate level of English

Key skills

Kotlin, Java, MySQL

Address

Nemiga, Minsk, Pobediteley Avenue, 7a

Vacancy published on May 20, 2019 in Minsk
