Job Description
· Build and optimize ‘big data’ pipelines, architectures, and data sets;
· Design, develop, and maintain data pipelines, data warehouses, and data lakes;
· Build the data products that technical users will depend on for business intelligence and ad-hoc access;
· Work side by side with our Data Science team to build and automate data pipelines, ETL processes, etc. on distributed data processing platforms such as Spark;
· Prepare data inputs for a generic “model builder” blueprint;
· Build production data pipelines for daily ETL and model retraining;
· Handle end-to-end data processing, troubleshooting, and problem diagnosis.
Job Requirements
· Have 4 years of experience in a Data Engineer (DE) position;
· Good logical thinking;
· Passionate about coding and programming, innovation, and solving challenging problems;
· Understanding of computer science fundamentals (data structures and algorithms, operating systems, networks, databases, etc.);
· Willing to work with cross-functional teams in a dynamic and fast-paced environment;
Preferred Qualifications (a plus):
· Big data tools: Kafka, Hadoop, Hive, Spark, Elasticsearch;
· Relational SQL and NoSQL databases: MySQL, MongoDB, HBase, Cassandra, ClickHouse;
· Data pipeline and workflow management tools: Airflow, NiFi.