University: Massey University (MU)
Subject: 161.777 Practical Data Mining

161.777 Project part 1, 2026

Overview

This part of the project has two modelling exercises. In each exercise you will use the supplied *_train_val data to build and compare models, then generate predictions for a blind test set.

Use only code, methods, functions, and packages that have appeared in the course up to Week 9. Do not use any material from outside the course. Work must be your own and may be checked by oral interview if needed.

Start by downloading the project file. Right-click the link below and choose Save File As…:

https://www.massey.ac.nz/~jcmarsha/161777/assessment/project_part1.Rmd

Then load it into RStudio.

Files to submit

Submit these three files:

  1. ex1_studentid.csv
  2. ex2_studentid.csv
  3. report_studentid.html

Marking

| Exercise | Predictive performance | Report |
| --- | --- | --- |
| Exercise 1 | 15 marks, measured by RMSE | 10 marks for methodology explanation |
| Exercise 2 | 10 marks, measured by classification accuracy | 10 marks for methodology explanation + 5 marks for result explainability |
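
The brief names the two performance measures but does not write them out. Assuming the standard definitions are the ones used for marking, they can be stated as follows, where y_i is the true price_per_unit and ŷ_i the value in .pred for test row i, c_i is the true variety and ĉ_i the label in .pred_class, and n is the number of test rows (104 for the property data, 54 for the wheat data):

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2},
\qquad
\mathrm{Accuracy} = \frac{1}{n}\sum_{i=1}^{n}\mathbf{1}\left[c_i = \hat{c}_i\right]
```

Under these definitions a lower RMSE is better for Exercise 1, and a higher accuracy is better for Exercise 2.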

The report must be knitted from project_part1.Rmd and saved as report_studentid.html.

Keep the write-up to at most 500 words per exercise. For each exercise, briefly describe your methodology, for example:

  • Exploratory analysis carried out
  • Data processing performed, including justification of how any missingness or unusual observations are handled
  • Modelling techniques tried, including any tuning or variable choices
  • Final model selected, and why

For Exercise 2, also provide a few concluding sentences about what differentiates the wheat varieties. You may support your conclusions with a table or figure.

Submission Rules

  • Start from the supplied test file.
  • Keep the row order unchanged.
  • Do not alter the supplied predictor columns.
  • Add exactly one prediction column and no extra columns.
  • For Exercise 1, name the prediction column .pred
  • For Exercise 2, name the classification column .pred_class
  • For Exercise 2, submit class labels, not probabilities.
  • Use only property_train_val.csv and wheat_train_val.csv for model selection.
  • Do not use property_test.csv or wheat_test.csv to choose models.
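
The file-format rules above can be sketched in base R as follows. This is a minimal illustration, not a required workflow: the three-row data frame stands in for the supplied property_test.csv with invented values, the numbers in preds are placeholders rather than output from any model, and the file is written to a temporary directory.

```r
# Hypothetical stand-in for property_test.csv (values invented for illustration).
test <- data.frame(
  sale_time        = c(2013.25, 2013.5, 2012.917),
  property_age     = c(12.0, 5.5, 30.1),
  transit_distance = c(390.6, 90.5, 1360.1),
  nearby_stores    = c(5L, 9L, 1L),
  latitude         = c(24.979, 24.974, 24.963),
  longitude        = c(121.539, 121.543, 121.512)
)

# Placeholder predictions; in practice these would come from a fitted model.
preds <- c(40.1, 45.3, 22.8)

# Add exactly one prediction column, leaving row order and predictors untouched.
submission <- test
submission$.pred <- preds

# Write without row names so no extra column appears in the file.
out_file <- file.path(tempdir(), "ex1_studentid.csv")
write.csv(submission, out_file, row.names = FALSE)

# Read the file back and check it against the submission rules.
check <- read.csv(out_file)
stopifnot(
  nrow(check) == nrow(test),                          # same number of rows
  identical(names(check), c(names(test), ".pred")),   # one extra column only
  isTRUE(all.equal(check[names(test)], test))         # predictors unchanged
)
```

For Exercise 2 the same pattern would apply with a column named .pred_class holding class labels instead of numeric values.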

Exercise 1: Property Value Prediction [25 marks]

This exercise concerns residential property values in New Taipei City, Taiwan.

Use property_train_val.csv to build one or more prediction models for price_per_unit, then predict price_per_unit for every row in property_test.csv.

property_train_val.csv contains 310 labelled observations. property_test.csv contains 104 unlabelled observations.

| Variable | Description |
| --- | --- |
| sale_time | Decimal year of sale |
| property_age | Property age in years |
| transit_distance | Distance to the nearest rapid-transit station |
| nearby_stores | Number of nearby convenience stores |
| latitude | Latitude |
| longitude | Longitude |
| price_per_unit | Property value per unit area, available only in property_train_val.csv |

Submit ex1_studentid.csv as the supplied property_test.csv data with one extra column named .pred.

Exercise 2: Wheat Variety Classification [25 marks]

This exercise concerns wheat kernels from three varieties.

Use wheat_train_val.csv to build one or more classification models for variety, then classify every row in wheat_test.csv.

wheat_train_val.csv contains 156 labelled observations. wheat_test.csv contains 54 unlabelled observations.

| Variable | Description |
| --- | --- |
| kernel_area | Kernel area |
| kernel_perimeter | Kernel perimeter |
| kernel_compactness | Compactness measure |
| kernel_length | Kernel length |
| kernel_width | Kernel width |
| kernel_asymmetry | Asymmetry measure |
| groove_length | Groove length |
| variety | Wheat variety, available only in wheat_train_val.csv |

In addition to the methodology description, provide a concluding paragraph on what differentiates the wheat varieties. You may support your conclusions with a table or figure. This part is worth 5 marks.

Submit ex2_studentid.csv as the supplied wheat_test.csv data with one extra column named .pred_class.
