From b4f1de3e95d9320c60a74266b72d4b49b6fad809 Mon Sep 17 00:00:00 2001
From: perov <perov@cl.uni-heidelberg.de>
Date: Fri, 28 Mar 2025 22:48:02 +0000
Subject: [PATCH] add explanation

---
 src/README.md | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/src/README.md b/src/README.md
index ba8a6a2..e904471 100644
--- a/src/README.md
+++ b/src/README.md
@@ -1,4 +1,21 @@
 # Source Explanation:
+This file explains the functionality of each script.
 
 ## Description
-This file explains the functionality of each script.
+#### text_extraction.py
+Downloads the Kaggle dataset of poems and uses `wikipediaapi` to extract the required Wikipedia articles. The BBC News articles are scraped. Only the first few sentences of each text are kept so that the survey stays short.
+
+#### models.py
+Contains the code that fine-tunes GPT-2 and OPT and prompts them for text generation.
+
+#### compute_metrics.py
+Computes the four metrics (FRE, PMI, TF-IDF, TTR). PMI and TF-IDF are fitted on the Poetry Foundation dataset, excluding the first 9 instances, which are used for the survey.
+
+#### automatic_prediciton.py
+Extracts the required sentences and lines from the .txt files in the "Data" folder. For each of coherence, conciseness, creativity and clarity, it decides which text scores higher; the code simply outputs "ai" or "human" for the higher-scoring text. It also predicts which of the two texts it would pick: an output of "human" means it would pick the human-written one.
+
+#### asses_results.py
+Extracts the survey data stored in a .csv file. Most of the code pulls answers out of the .csv file, for example how many people correctly identified the LLM in Section 3 or how long each participant took to finish the survey.
+
+#### display_results.py
+Uses the previous scripts to collect all answers and outputs and plots them with matplotlib.
\ No newline at end of file
-- 
GitLab