diff --git a/README.md b/README.md
index a2a285a4e9e490f35949db315c8f39ff0245fe3c..dbfa17ffb6e985015676e8f8466b6bd5e4a15f38 100644
--- a/README.md
+++ b/README.md
@@ -3,8 +3,13 @@
 ## Name
 Evaluation of generative AI using human judgment and automatic metrics
 
-## Description
+## Project Description
 3 Models were used to generate texts: GPT2, OPT and GPT4o. Texts were generated in 3 categories: Poems, science-related topics and sport summaries. Similar prompts were used on all the models.
 In a survey these texts were compared to human written texts. The poems were gathered from PoetryFoundation dataset, science-related texts were gathered from Wikipedia and sport summaries are taken from BBC sports.
 Participants were asked to identify the text generated by the LLM and rate both the human and LLM generated texts on 4 parameters: Coherence, Conciseness, Creativity and Clarity of Concept. Creativity was only asked for the poems and Clarity of Concept only for the science-related texts.
 For the automatic metrics FRE, PMI, TF-IDF and TTR were used.
+
+## Folder Descriptions
+The "Data" folder contains .txt files with the outputs created by the models used for this project, together with the prompts and the extracted human texts. Inside the .txt files, some lines are marked with an "X". This marks the text that was used in the survey.
+The "Results" folder contains the unprocessed survey data as a .csv file and the processed data in the form of .png files.
+The "src" folder contains all the code used for this project.