Data Manipulation, Pandas and Distance Metrics -- 2

Cancelled Posted 7 years ago Paid on delivery

Write the Python code to load the CSV file from SpatialKey directly into a Pandas dataframe.

• You may need to familiarize yourself with the IO Tools functions of Pandas.

• Turn in the fragment of code that does this.
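As a rough illustration of this step, `pandas.read_csv` accepts a URL, a local path, or a file-like object, so the file can be read without downloading it first. The URL below is a hypothetical placeholder (the posting does not give the real one), and the column names and rows in the offline demonstration are made up for illustration only.

```python
import io

import pandas as pd

# Hypothetical placeholder: substitute the real SpatialKey sample CSV URL.
CSV_URL = "https://example.com/path/to/spatialkey_sample.csv"


def load_transactions(source=CSV_URL):
    """Load a CSV into a DataFrame.

    `source` may be a URL, a local path, or a file-like object;
    pandas.read_csv handles all three.
    """
    return pd.read_csv(source)


# Offline demonstration with an in-memory CSV.
# The rows and column names are invented for this sketch.
sample_csv = io.StringIO(
    "street,beds,baths,sq_ft,price,latitude,longitude\n"
    "1 EXAMPLE ST,2,1,800,60000,38.60,-121.40\n"
    "2 EXAMPLE AVE,3,2,1200,90000,38.50,-121.50\n"
)
df = load_transactions(sample_csv)
print(df.head())
```

Calling `load_transactions()` with no argument would attempt to fetch `CSV_URL`, so the placeholder has to be replaced before that call can succeed.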

§ Compute the price per square foot for all properties. Add this data back into the dataframe.

• See the documentation on concatenating objects.

• Turn in the code that does this, and show the results of head() and tail() on the new dataframe.
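One possible way to do this, following the hint about concatenation, is to compute the ratio as a new Series and attach it with `pandas.concat` along the column axis. The column names `price` and `sq_ft`, the helper name `add_price_per_sq_ft`, and the sample rows are all assumptions made for this sketch; the real file may name its columns differently.

```python
import pandas as pd


def add_price_per_sq_ft(df, price_col="price", area_col="sq_ft"):
    """Return a copy of df with a price_per_sq_ft column appended.

    Column names are assumptions; adjust them to match the real CSV.
    """
    ppsf = (df[price_col] / df[area_col]).rename("price_per_sq_ft")
    # axis=1 concatenates column-wise, aligning on the row index.
    return pd.concat([df, ppsf], axis=1)


# Made-up rows for illustration; the last one has zero square feet on purpose.
df = pd.DataFrame(
    {
        "street": ["1 EXAMPLE ST", "2 EXAMPLE AVE", "3 EXAMPLE CT"],
        "sq_ft": [800, 1200, 0],
        "price": [60000, 96000, 50000],
    }
)
df = add_price_per_sq_ft(df)
print(df.head())
print(df.tail())
```

Rows with zero square footage produce an infinite ratio rather than an error in pandas, which is worth keeping visible because such rows are natural candidates for the data-quality question that follows.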

§ Provide three data points from the original dataset that give you reason to be concerned about some of the properties with unusual price per square foot data.

• Learn about summarizing your data and sorting.

• Turn in the rows of those three data points and 1 to 3 sentences discussing your concerns.
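A minimal sketch of how summarizing and sorting might surface suspicious rows: `describe()` gives the overall spread, and `sort_values` exposes the extremes at either end. The helper name `most_extreme`, the column name `price_per_sq_ft`, and the sample values are assumptions for illustration, not data from the real file.

```python
import pandas as pd


def most_extreme(df, col="price_per_sq_ft", n=3):
    """Return the n lowest and n highest rows by `col` (ascending order)."""
    ordered = df.sort_values(col)
    return ordered.head(n), ordered.tail(n)


# Made-up values for illustration only.
df = pd.DataFrame(
    {
        "street": ["A ST", "B ST", "C ST", "D ST", "E ST"],
        "price_per_sq_ft": [75.0, 80.0, 2.5, 900.0, 110.0],
    }
)

# Summary statistics: count, mean, std, min, quartiles, max.
print(df["price_per_sq_ft"].describe())

lowest, highest = most_extreme(df, n=2)
print(lowest)
print(highest)
```

Values far outside the quartile range in the `describe()` output are a reasonable starting point for picking the three rows to discuss.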

§ Use a scatter plot to plot the sale price against the square feet.

§ Do the same for number of beds to sale price.

• You will need to learn about scatter plots to complete this.

• For both, turn in the scatter plot. If you're using Jupyter, just leave the data inline.
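Both scatter plots could be drawn with `DataFrame.plot.scatter`, as sketched below. The column names (`sq_ft`, `beds`, `price`) and the sample rows are assumptions. The `Agg` backend line is only there so the sketch also runs as a plain script; in Jupyter it can be dropped so the figures render inline.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line in Jupyter

import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows for illustration; column names are assumptions.
df = pd.DataFrame(
    {
        "beds": [2, 3, 3, 4],
        "sq_ft": [800, 1200, 1350, 2000],
        "price": [60000, 96000, 105000, 180000],
    }
)

# One figure with two panels: price vs. square feet, and price vs. beds.
fig, (ax_area, ax_beds) = plt.subplots(1, 2, figsize=(10, 4))
df.plot.scatter(x="sq_ft", y="price", ax=ax_area, title="Sale price vs. square feet")
df.plot.scatter(x="beds", y="price", ax=ax_beds, title="Sale price vs. number of beds")
fig.tight_layout()
```

Putting both panels on a shared figure is a design choice for compactness; two separate `plot.scatter` calls without the `ax` argument would work just as well.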

§ Explore the distribution of properties by number of beds. Plot this using plot() (see the Pandas documentation for that function).
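One way to look at this distribution, assuming a `beds` column, is to count the properties per bed count with `value_counts()` and pass the result to `plot()` as a bar chart. The sample values are made up, and the `Agg` backend line can be dropped in Jupyter.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line in Jupyter

import pandas as pd

# Made-up values for illustration; the column name is an assumption.
df = pd.DataFrame({"beds": [2, 3, 3, 4]})

# Count properties per number of beds, ordered by bed count rather than frequency.
counts = df["beds"].value_counts().sort_index()

ax = counts.plot(kind="bar")
ax.set_xlabel("beds")
ax.set_ylabel("number of properties")
print(counts)
```

A bar chart suits this column because the number of beds takes only a handful of discrete values; `df["beds"].plot(kind="hist")` would be an alternative for a quick look.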

PART 2 / Distance metrics and k-Nearest Neighbor

§ Implement the k-Nearest Neighbor algorithm in Python using the Euclidean distance metric.

• Turn in the code for this implementation in Python.

• Your implementation will take two vectors and return a single number. Use the template:

def my_knn(vec1, vec2):
    # your implementation
    return d  # the Euclidean distance of vec1 and vec2
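A minimal sketch of how the template might be filled in, using only the standard library. Despite its name, `my_knn` as specified returns just the Euclidean distance between two vectors. The `k_nearest` helper is a hypothetical addition, not part of the template, showing how a neighbor lookup could be built on top of that distance.

```python
import math


def my_knn(vec1, vec2):
    """Euclidean distance between two equal-length numeric vectors."""
    if len(vec1) != len(vec2):
        raise ValueError("vectors must have the same length")
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(vec1, vec2)))
    return d


def k_nearest(query, points, k):
    """Hypothetical helper: indices of the k points closest to `query`."""
    order = sorted(range(len(points)), key=lambda i: my_knn(query, points[i]))
    return order[:k]


print(my_knn([0, 0], [3, 4]))  # 5.0
print(k_nearest([0, 0], [[3, 4], [1, 0], [0, 2]], k=2))  # [1, 2]
```

Sorting every distance is the simplest approach and is adequate for a dataset of this size; it is not meant as an efficient nearest-neighbor search.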

§ Produce the distance table using your version of kNN for all properties.

• To make things easier, please reduce the data vector to just beds, baths, square footage, price, latitude and longitude.

• You can use the street as the index. Your final output will look something like this:
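A rough sketch of how such a table could be produced, under stated assumptions: the six feature column names and the `street` index column are guesses at the CSV's headers, the sample rows are made up, and the function name `distance_table` is hypothetical. The distances are computed with NumPy broadcasting, which is a vectorized equivalent of calling the two-vector Euclidean function on every pair of rows rather than a literal use of `my_knn`.

```python
import numpy as np
import pandas as pd

# Assumed column names; adjust to match the real CSV headers.
FEATURES = ["beds", "baths", "sq_ft", "price", "latitude", "longitude"]


def distance_table(df, features=FEATURES, index_col="street"):
    """Square table of pairwise Euclidean distances, labeled by `index_col`."""
    X = df[features].to_numpy(dtype=float)
    # diffs[i, j, :] holds row i minus row j for every pair of rows.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    labels = pd.Index(df[index_col], name=index_col)
    return pd.DataFrame(dist, index=labels, columns=labels)


# Made-up rows for illustration only.
df = pd.DataFrame(
    {
        "street": ["A ST", "B ST", "C ST"],
        "beds": [2, 2, 2],
        "baths": [1, 1, 1],
        "sq_ft": [800, 803, 800],
        "price": [60000, 60004, 60000],
        "latitude": [38.0, 38.0, 38.0],
        "longitude": [-121.0, -121.0, -121.0],
    }
)
table = distance_table(df)
print(table)
```

Because the features are on very different scales, the raw price differences will dominate the distances; whether to rescale the columns first is a choice the brief leaves open.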

Data Mining

Project ID: #11669827
