Challenges

CA Challenges Repository

Matching Challenge

An important part of our work is linking records from different sources. Matched records are valuable because they provide a reliable labelled data set that can be used for predictive modelling. Your task is to write a matching routine that links the ids (id2) in the matching file (DatasetToMatch.csv) to the ids (id1) in the core data set (DatasetCore.csv). To establish these links, you may use any personal and contact information present in the files.
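
One possible starting point for such a routine is an exact match on normalised fields, sketched below in Python. Only `id1` and `id2` come from the task description; the `name` and `email` columns, the sample rows, and the helper names `normalise` and `match_records` are assumptions made for illustration, since the real column layout of the two files is not shown here. A real solution would likely add fuzzy comparison on top of this.

```python
import csv
import io


def normalise(value):
    """Lower-case the value and drop everything except letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def match_records(core_rows, match_rows, fields):
    """Return {id2: id1} for rows whose normalised `fields` agree exactly.

    Core rows with an empty field are not indexed, and keys shared by
    several core rows are skipped as ambiguous.
    """
    index = {}
    for row in core_rows:
        key = tuple(normalise(row.get(f, "")) for f in fields)
        if all(key):
            index.setdefault(key, []).append(row["id1"])

    links = {}
    for row in match_rows:
        key = tuple(normalise(row.get(f, "")) for f in fields)
        candidates = index.get(key, [])
        if len(candidates) == 1:
            links[row["id2"]] = candidates[0]
    return links


# Tiny in-memory stand-ins for DatasetCore.csv and DatasetToMatch.csv.
# The name and email columns are hypothetical.
core_csv = (
    "id1,name,email\n"
    "1,Ann Lee,ann.lee@example.com\n"
    "2,Bob Roy,bob@example.com\n"
)
match_csv = (
    "id2,name,email\n"
    "10,ann lee,Ann.Lee@Example.com\n"
    "11,Cy Poe,cy@example.com\n"
)

core_rows = list(csv.DictReader(io.StringIO(core_csv)))
match_rows = list(csv.DictReader(io.StringIO(match_csv)))
links = match_records(core_rows, match_rows, ["name", "email"])
print(links)
```

Indexing the core file first keeps the lookup for each row of the matching file cheap, and skipping ambiguous keys avoids linking an id2 to the wrong id1 when several core records look identical.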

Modelling Challenge

We have purchased a large set of consumer data covering the 2,000,000 customers provided by ACME, plus an additional 500,000 records for potential future targets. The records include various demographic details and a selection of proprietary Consumer Expenditure models for each consumer.

Scala Challenge

A client has provided us with a copy of their membership database. Your task is to write a small ETL pipeline to ingest the data and run some basic queries to get an initial feel for its contents.

The data will eventually be used to reference a series of commercial features from another data set via the vendor_id.
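
The overall shape of such a pipeline (extract, load, then query) can be sketched as follows. This illustration is written in Python with an in-memory SQLite database purely to show the flow; the challenge itself asks for Scala. Apart from `vendor_id`, every name here (the `members` table, the `member_id` and `joined` columns, the `load_members` helper, and the sample rows) is an assumption, since the actual schema of the membership database is not described.

```python
import csv
import io
import sqlite3


def load_members(conn, csv_file):
    """Create a `members` table and load every row of the CSV into it.

    Returns the number of rows loaded. The column names are hypothetical.
    """
    conn.execute(
        "CREATE TABLE members (member_id TEXT, vendor_id TEXT, joined TEXT)"
    )
    rows = [
        (r["member_id"].strip(), r["vendor_id"].strip(), r["joined"].strip())
        for r in csv.DictReader(csv_file)
    ]
    conn.executemany("INSERT INTO members VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# Small in-memory stand-in for the client's membership export.
sample = (
    "member_id,vendor_id,joined\n"
    "m1,v1,2015-01-03\n"
    "m2,v1,2015-02-11\n"
    "m3,v2,2016-07-30\n"
)

conn = sqlite3.connect(":memory:")
loaded = load_members(conn, io.StringIO(sample))

# Basic exploratory queries: total row count and members per vendor_id.
total = conn.execute("SELECT COUNT(*) FROM members").fetchone()[0]
per_vendor = conn.execute(
    "SELECT vendor_id, COUNT(*) FROM members "
    "GROUP BY vendor_id ORDER BY vendor_id"
).fetchall()
print(loaded, total, per_vendor)
```

Counting members per `vendor_id` early is useful because that column is the key later used to reference the commercial features in the other data set.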