Showing posts from July, 2020

Location, Location, Location: Real Estate Data Analysis

For my most recent data science project, I was tasked with creating a business case around a provided dataset and then solving it. The data described a few years of home sales in King County, Washington, along with plenty of variables describing each home sold. I knew right away that I wanted to use price as my outcome variable, since most business cases that came to mind would use it as the deciding factor. Most of the other variables in the dataset had a fairly obvious relationship to price, at least at a high level. For example, I figured that more bedrooms, more bathrooms, or more square footage would all lead to higher prices, so these didn't really interest me for a business case. What caught my eye instead was the location data associated with each entry. Every house had fairly precise latitude and longitude coordinates as variables, as well as a ZIP code. These are two components that I thought did not have any obvi