Commit b4f1de3e authored by perov
add explanation

parent 383af707
# Source Explanation
This file explains the functionality of each script.
## Description
#### text_extraction.py
Downloads the Kaggle dataset of poems and uses the `wikipediaapi` package to extract the needed Wikipedia articles. The BBC News articles are scraped. Only the first few sentences of each text are extracted to keep the survey short.
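
As a rough illustration of the truncation step, the sketch below keeps only the first few sentences of a text. The function name `first_sentences`, the default of three sentences, and the naive punctuation-based split are all assumptions made for this example; the script itself may split sentences differently.

```python
import re


def first_sentences(text: str, n: int = 3) -> str:
    """Return the first n sentences of text (hypothetical helper).

    Sentences are split naively after '.', '!' or '?' followed by
    whitespace, so abbreviations such as "Dr." will be split too.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return " ".join(parts[:n])


if __name__ == "__main__":
    sample = "First sentence. Second one! Third one? Fourth sentence."
    print(first_sentences(sample, 2))
```

A dedicated sentence tokenizer would handle abbreviations and quotations more reliably than this regular expression.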
#### models.py
Contains the code that fine-tunes GPT2 and OPT and prompts them for text generation.
#### compute_metrics.py
Contains the functionality to compute the four metrics: FRE, PMI, TF-IDF, and TTR. PMI and TF-IDF are trained on the Poetry Foundation Dataset, excluding the first 9 instances, which are used for the survey.
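
To make two of these metrics concrete, here is a minimal sketch, assuming TTR means the type-token ratio and PMI is estimated from adjacent word pairs. The function names, the tokenizer, and the bigram-based estimate are assumptions for illustration and are not taken from `compute_metrics.py`.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Number of distinct tokens divided by the total number of tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def pmi(corpus_lines, w1, w2):
    """PMI (base 2) of the adjacent pair (w1, w2), estimated from counts."""
    unigrams, bigrams = Counter(), Counter()
    for line in corpus_lines:
        toks = tokenize(line)
        unigrams.update(toks)
        bigrams.update(zip(toks, toks[1:]))
    if bigrams[(w1, w2)] == 0:
        return float("-inf")  # the pair never occurs in the corpus
    p_xy = bigrams[(w1, w2)] / sum(bigrams.values())
    p_x = unigrams[w1] / sum(unigrams.values())
    p_y = unigrams[w2] / sum(unigrams.values())
    return math.log2(p_xy / (p_x * p_y))


if __name__ == "__main__":
    print(type_token_ratio("the cat and the dog"))
    print(pmi(["a b", "a b"], "a", "b"))
```

In this sketch the corpus passed to `pmi` stands in for the training data, so the survey instances would have to be excluded before the call.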
#### automatic_prediciton.py
Contains the code to extract the needed sentences and lines from the .txt files in the "Data" folder. It also assigns higher/lower coherence, conciseness, creativity, and clarity scores: for each criterion, the code outputs "ai" or "human" for whichever text has the higher score. Finally, it predicts which of the two texts it would pick (human or ai); an output of "human" means it would pick the human-written text.
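
The labelling described above might look roughly like the following sketch. The names `label_higher` and `predict_pick` are hypothetical, and both the tie handling and the majority vote used for the final pick are assumptions made for illustration; the script's actual decision rule is not documented here.

```python
CRITERIA = ("coherence", "conciseness", "creativity", "clarity")


def label_higher(human_scores, ai_scores):
    """Per criterion, return 'human' or 'ai' for the higher-scoring text.

    Ties go to 'ai' in this sketch (an arbitrary assumption).
    """
    return {
        c: "human" if human_scores[c] > ai_scores[c] else "ai"
        for c in CRITERIA
    }


def predict_pick(labels):
    """Pick the text that wins more criteria (assumed majority vote)."""
    human_wins = sum(1 for v in labels.values() if v == "human")
    return "human" if human_wins > len(labels) - human_wins else "ai"


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    human = {"coherence": 0.9, "conciseness": 0.4, "creativity": 0.8, "clarity": 0.7}
    ai = {"coherence": 0.6, "conciseness": 0.5, "creativity": 0.3, "clarity": 0.2}
    labels = label_higher(human, ai)
    print(labels)
    print(predict_pick(labels))
```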
#### asses_results.py
Contains the behaviour to extract the survey data, which is stored in a .csv file. The bulk of the code extracts answers from the .csv file, such as how many people guessed the LLM correctly in Section 3 or how long each participant took to finish the survey.
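
A minimal sketch of this kind of extraction, using only the standard library, is shown below. The column names (`start_time`, `end_time`, `section3_guess`, `section3_truth`), the timestamp format, and the sample rows are all made up for illustration; the real survey export will have its own layout.

```python
import csv
import io
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed timestamp format


def summarize(csv_file):
    """Return (number of correct Section 3 guesses, durations in minutes)."""
    rows = list(csv.DictReader(csv_file))
    correct = sum(1 for r in rows if r["section3_guess"] == r["section3_truth"])
    durations = []
    for r in rows:
        start = datetime.strptime(r["start_time"], TIME_FORMAT)
        end = datetime.strptime(r["end_time"], TIME_FORMAT)
        durations.append((end - start).total_seconds() / 60)
    return correct, durations


# Fabricated sample data standing in for the survey export.
SAMPLE = """start_time,end_time,section3_guess,section3_truth
2024-01-01 10:00:00,2024-01-01 10:10:00,GPT2,GPT2
2024-01-01 11:00:00,2024-01-01 11:12:30,OPT,GPT2
"""

if __name__ == "__main__":
    print(summarize(io.StringIO(SAMPLE)))
```

With a real file, `summarize(open(path, newline=""))` would replace the `io.StringIO` wrapper.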
#### display_results.py
Uses the previous scripts to collect all the answers and outputs, and displays them using matplotlib.