| University | Massey University (MU) |
| Subject | 161.777 Practical Data Mining |
161.777 Project part 1, 2026
Overview
This part of the project has two modelling exercises. In each exercise you will use the supplied *_train_val data to build and compare models, then generate predictions for a blind test set.
Use only code, methods, functions, and packages that have appeared in the course up to Week 9. Do not use any material from outside the course. Work must be your own and may be checked by oral interview if needed.
Start by downloading the project file: right-click the link below and choose Save File As…
https://www.massey.ac.nz/~jcmarsha/161777/assessment/project_part1.Rmd
Then load it into RStudio.
Files to submit
Submit these three files:
- ex1_studentid.csv
- ex2_studentid.csv
- report_studentid.html
Marking
| Exercise | Predictive performance | Report |
|---|---|---|
| Exercise 1 | 15 marks, measured by RMSE | 10 marks methodology explanation |
| Exercise 2 | 10 marks, measured by classification accuracy | 10 marks methodology explanation + 5 marks result explainability |
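The brief names the two performance measures but does not spell out their formulas. The sketch below assumes the standard definitions (RMSE as the square root of the mean squared error, accuracy as the proportion of correct labels) and uses base R only. The function names and the toy values are illustrative and do not come from the project data.

```r
# Root mean squared error: square root of the mean squared difference.
rmse <- function(truth, estimate) {
  sqrt(mean((truth - estimate)^2))
}

# Classification accuracy: proportion of predicted labels that match the truth.
accuracy <- function(truth, estimate) {
  mean(as.character(truth) == as.character(estimate))
}

# Toy values for illustration only (not from the project data).
toy_truth <- c(30, 40, 50)
toy_pred  <- c(32, 38, 50)
rmse(toy_truth, toy_pred)

toy_class <- c("A", "B", "C", "A")
toy_guess <- c("A", "B", "A", "A")
accuracy(toy_class, toy_guess)
```

Lower RMSE is better for Exercise 1, while higher accuracy is better for Exercise 2.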
The report must be knitted from project_part1.Rmd and saved as report_studentid.html.
Keep the write-up to at most 500 words per exercise. For each exercise, briefly describe your methodology, covering points such as:
- Exploratory analysis carried out
- Data processing performed, including justification of how any missingness or unusual observations are handled
- Modelling techniques tried, including any tuning or variable choices
- Final model selected, and why
For Exercise 2, also provide a few concluding sentences about what differentiates the wheat varieties. You may support your conclusions with a table or figure.
Submission Rules
- Start from the supplied test file.
- Keep the row order unchanged.
- Do not alter the supplied predictor columns.
- Add exactly one prediction column and no extra columns.
- For Exercise 1, name the prediction column .pred.
- For Exercise 2, name the classification column .pred_class.
- For Exercise 2, submit class labels, not probabilities.
- Use only property_train_val.csv and wheat_train_val.csv for model selection.
- Do not use property_test.csv or wheat_test.csv to choose models.
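One way to check a submission file against the rules above is sketched here in base R. It is a minimal illustration under stated assumptions: the toy test file, its two predictor columns, and the placeholder predictions are all made up, and the file is written to a temporary directory rather than to a real project folder.

```r
# Toy stand-in for a supplied test file (hypothetical values, two predictors only).
test_path <- tempfile(fileext = ".csv")
write.csv(data.frame(property_age = c(5, 12, 30),
                     nearby_stores = c(7, 3, 0)),
          test_path, row.names = FALSE)

test_data <- read.csv(test_path)

# Placeholder predictions; in practice these come from the chosen model.
predictions <- c(41.2, 35.0, 22.7)

# Append exactly one column, leaving the predictors and row order untouched.
submission <- test_data
submission[[".pred"]] <- predictions

# Sanity checks before writing.
stopifnot(
  nrow(submission) == nrow(test_data),
  identical(names(submission), c(names(test_data), ".pred")),
  isTRUE(all.equal(submission[names(test_data)], test_data)),
  !anyNA(submission[[".pred"]])
)

out_path <- file.path(tempdir(), "ex1_studentid.csv")
write.csv(submission, out_path, row.names = FALSE)

# Read the file back so the column names can be inspected exactly as written.
names(read.csv(out_path, check.names = FALSE))
```

Setting row.names = FALSE matters here, because otherwise write.csv would add an unnamed row-index column and break the "exactly one extra column" rule.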
Exercise 1: Property Value Prediction [25 marks]
This exercise concerns residential property values in New Taipei City, Taiwan.
Use property_train_val.csv to build one or more prediction models for price_per_unit, then predict price_per_unit for every row in property_test.csv.
property_train_val.csv contains 310 labelled observations. property_test.csv contains 104 unlabelled observations.
| Variable | Description |
|---|---|
| sale_time | Decimal year of sale |
| property_age | Property age in years |
| transit_distance | Distance to the nearest rapid-transit station |
| nearby_stores | Number of nearby convenience stores |
| latitude | Latitude |
| longitude | Longitude |
| price_per_unit | Property value per unit area, available only in property_train_val.csv |
Submit ex1_studentid.csv as the supplied property_test.csv data with one extra column named .pred.
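Because price_per_unit is available only in property_train_val.csv, candidate models have to be compared using that file alone. The sketch below shows one possible approach, a single hold-out split, in base R. The simulated data, the 80/20 ratio, the seed, and the two linear models are all assumptions chosen for illustration. They are not a recommended method, and the course may use different tools for resampling.

```r
set.seed(2026)  # arbitrary seed so the split is reproducible

# Simulated stand-in with the same row count as property_train_val.csv;
# the values and the relationship are made up for illustration.
n <- 310
toy <- data.frame(
  property_age = runif(n, 0, 40),
  nearby_stores = sample(0:10, n, replace = TRUE)
)
toy$price_per_unit <- 45 - 0.3 * toy$property_age +
  2 * toy$nearby_stores + rnorm(n, sd = 5)

# Hold out 20% of the labelled rows for validation (the ratio is an assumption).
val_rows <- sample(n, size = round(0.2 * n))
train <- toy[-val_rows, ]
val   <- toy[val_rows, ]

rmse <- function(truth, estimate) sqrt(mean((truth - estimate)^2))

# Two candidate linear models, fitted on the training rows only.
fit_small <- lm(price_per_unit ~ property_age, data = train)
fit_large <- lm(price_per_unit ~ property_age + nearby_stores, data = train)

# Compare candidates on the held-out rows; the test file plays no part here.
val_rmse <- c(
  small = rmse(val$price_per_unit, predict(fit_small, newdata = val)),
  large = rmse(val$price_per_unit, predict(fit_large, newdata = val))
)
val_rmse
```

A single split on 310 rows gives a noisy estimate, so repeated splits or cross-validation may give a steadier comparison.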
Exercise 2: Wheat Variety Classification [25 marks]
This exercise concerns wheat kernels from three varieties.
Use wheat_train_val.csv to build one or more classification models for variety, then classify every row in wheat_test.csv.
wheat_train_val.csv contains 156 labelled observations. wheat_test.csv contains 54 unlabelled observations.
| Variable | Description |
|---|---|
| kernel_area | Kernel area |
| kernel_perimeter | Kernel perimeter |
| kernel_compactness | Compactness measure |
| kernel_length | Kernel length |
| kernel_width | Kernel width |
| kernel_asymmetry | Asymmetry measure |
| groove_length | Groove length |
| variety | Wheat variety, available only in wheat_train_val.csv |
In addition to the methodology description, provide a paragraph concluding what differentiates the wheat varieties. You may support your conclusions with a table or figure. This part is worth 5 marks.
Submit ex2_studentid.csv as the supplied wheat_test.csv data with one extra column named .pred_class.
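The .pred_class column must hold class labels rather than probabilities. If a model returns per-class probabilities, one common conversion is to take the most probable class for each row, as in the base R sketch below. The probability matrix and the labels "A", "B", "C" are hypothetical placeholders, not the variety labels in the data.

```r
# Hypothetical class probabilities for four kernels; the variety labels
# "A", "B", "C" are placeholders, not the labels used in the data.
probs <- matrix(
  c(0.80, 0.15, 0.05,
    0.10, 0.30, 0.60,
    0.25, 0.50, 0.25,
    0.40, 0.35, 0.25),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("A", "B", "C"))
)

# Pick the most probable class per row; ties go to the first column.
pred_class <- colnames(probs)[max.col(probs, ties.method = "first")]
pred_class
```

The ties.method argument is set explicitly because the default for max.col breaks ties at random, which would make the output irreproducible.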

