From a20547b83c311478bf0f67cf84d90b2694b0f589 Mon Sep 17 00:00:00 2001
From: F1nnH <finn@hillengass.de>
Date: Fri, 23 Feb 2024 15:52:10 +0100
Subject: [PATCH] Add link to data preparation instructions

---
 project/README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/project/README.md b/project/README.md
index fbf05af..86153b5 100644
--- a/project/README.md
+++ b/project/README.md
@@ -57,6 +57,7 @@ We opted not to use cross-validation for the train-dev split, considering the la
 
 The data partitioning script randomly segregates the images for each fruit class into the designated training, development, and testing sets. This **random allocation per class** is pivotal for maintaining the data integrity and representativeness in each subset, facilitating an unbiased evaluation of the model's performance.
 
+**To prepare the dataset for using it with this project please refer to the Data Preparation section in the [data folder](data/README.md).**
 
 ### Data Statistics
 
@@ -72,7 +73,6 @@ To visually represent the **class-wise distribution**, we plotted a histogram:
 
 *Balance of Dataset*: The histogram provides visual and easy to see insights into whether the dataset is balanced or unbalanced. In our case, the dataset is mostly balanced. The nectarine, orange and jostaberry classes may have insufficient datapoints. We will keep an eye on the performance of our models on these classes to see if the imbalance has an impact on the model's ability to learn and predict these classes :mag:.
 
-
 ## Metrics
 
 - **Accuracy**: The ratio of correctly predicted observations to the total predictions.
-- 
GitLab