Business case to supply to candidates for data analyst roles

stordco/ds_data_analyst_case

STORD Data Analytics Exercise

The goal of this exercise is to test your ability to tease out non-obvious insights from data. We are looking for candidates with sound statistics knowledge and strong analytical capabilities.

The dataset provided to you contains car sales listings and descriptions from Craigslist. Your task is to dig into this dataset and come up with the following aggregations:

  • Insightful descriptive statistics that indicate a trend in the data. You may combine columns or use a subset of columns; the choice is entirely up to you. Be as creative as possible and extract as many non-obvious metrics as you can.
  • Groups for your data such that every row can be easily assigned to a group of rows with similar features. Make sure to list the features that define each group.
  • Optional: Develop a model that predicts the listing price of a car from its features.
  • HINT: Make sure you identify and remove outliers in the dataset that may indicate data corruption or skew the results of your analysis.
  • HINT: Feel free to choose a subset or combination of columns for grouping the data and for your prediction model.
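As an illustration of the outlier hint and of a simple grouped statistic, here is a minimal sketch using only the Python standard library. It assumes rows have already been parsed into dictionaries with a numeric `price`. The Tukey fence rule (1.5 × IQR), the helper names, and the sample rows are assumptions made for this example; they are not prescribed by the exercise and the sample values are not taken from the dataset.

```python
from collections import defaultdict
from statistics import median, quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) Tukey fences for a list of numbers."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def drop_price_outliers(rows, k=1.5):
    """Keep rows whose price is positive and inside the Tukey fences."""
    prices = [r["price"] for r in rows if r["price"] > 0]
    low, high = iqr_bounds(prices, k)
    return [r for r in rows if r["price"] > 0 and low <= r["price"] <= high]


def median_price_by(rows, key):
    """Group rows by one column and return {group: median price}."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[key]].append(r["price"])
    return {g: median(p) for g, p in groups.items()}


# Made-up sample rows, for illustration only (not taken from the dataset).
sample = [
    {"price": 4500, "fuel": "gas"},
    {"price": 5200, "fuel": "gas"},
    {"price": 6100, "fuel": "gas"},
    {"price": 7000, "fuel": "gas"},
    {"price": 9500, "fuel": "gas"},
    {"price": 8000, "fuel": "diesel"},
    {"price": 11000, "fuel": "diesel"},
    {"price": 12500, "fuel": "diesel"},
    {"price": 0, "fuel": "gas"},          # placeholder price
    {"price": 123456789, "fuel": "gas"},  # likely data-entry error
]

clean = drop_price_outliers(sample)
print(len(clean), median_price_by(clean, "fuel"))
```

The median is used rather than the mean because it is less sensitive to any extreme prices that survive the filter. Other rules (percentile trimming, log-scale fences, domain thresholds) may suit the real data better, and the choice is worth justifying in your write-up.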

The Data

You can download the data from here

Data fields

  • id - entry ID
  • url - listing URL
  • region - craigslist region
  • region_url - region URL
  • price - entry price
  • year - entry year
  • manufacturer - manufacturer of vehicle
  • model - model of vehicle
  • condition - condition of vehicle
  • cylinders - number of cylinders
  • fuel - fuel type
  • odometer - miles traveled by vehicle
  • title_status - title status of vehicle
  • transmission - transmission of vehicle
  • vin - vehicle identification number
  • drive - type of drive
  • size - size of vehicle
  • type - generic type of vehicle
  • paint_color - color of vehicle
  • image_url - image URL
  • description - listed description of vehicle
  • county - county
  • state - state of listing
  • lat - latitude of listing
  • long - longitude of listing
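A first pass over these fields might look like the sketch below. It assumes the download is a CSV file whose header row uses the field names listed above; that layout, the choice of which columns to treat as numeric, and the helper names are assumptions for illustration. The inline sample text stands in for the real file and is made up.

```python
import csv
import io

# Columns assumed to hold numbers; everything else is left as text.
NUMERIC_FIELDS = ("price", "year", "odometer", "lat", "long")


def to_float(text):
    """Return a float, or None when the cell is blank or not numeric."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


def load_listings(handle):
    """Read listings from an open CSV file and coerce numeric columns."""
    rows = []
    for row in csv.DictReader(handle):
        for field in NUMERIC_FIELDS:
            row[field] = to_float(row.get(field))
        rows.append(row)
    return rows


def missing_share(rows, field):
    """Fraction of rows where a field is None or an empty string."""
    if not rows:
        return 0.0
    missing = sum(1 for r in rows if r.get(field) in (None, ""))
    return missing / len(rows)


# Made-up two-row sample standing in for the real file.
sample_csv = io.StringIO(
    "id,price,year,manufacturer,odometer,lat,long\n"
    "1,8500,2012,toyota,98000,33.75,-84.39\n"
    "2,0,,ford,,,\n"
)
listings = load_listings(sample_csv)
print(missing_share(listings, "odometer"))
```

With the real file, the same function can be called on a handle from `open(path, newline="", encoding="utf-8")`. Checking the share of missing values per column early helps decide which fields are usable for grouping or modeling.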

Deliverable

  • Implement your solution in the tool of your choice (Python or R preferred; you may use a visualization tool to supplement it).
  • Email your solution to the point of contact who sent you this exercise.
    • Include all relevant statistical graphs
    • Make sure to label all graphs
    • Explain your thought process and add comments where needed
    • Summarize your findings/takeaways from the assignment
  • Be prepared to walk through your solution during the interview with the hiring manager.

Evaluation

Your submission will be evaluated by the reviewer against the following criteria:

  • Completeness - Does your submission meet the Deliverables specified above?
  • Business Acumen - How easy was it for a business stakeholder (with limited technical knowledge) to understand your solution?
  • Storylining - Evaluate the flow of your solution
  • Creativity - Evaluate your ability to tease out non-obvious insights from the data
  • Technical Capability - How statistically reliable are your insights?
  • Critical Thinking - Please be ready to speak to your thinking process for the solution here
  • Attention to Detail - How did you design your solution?

Notes:

  • Please feel free to ask clarifying questions via email!
  • Thank you for the time you are spending as a candidate with STORD!
  • Please return your completed assignment within 5 days of receipt.
