Commit 894ef131 authored by Aileen Reichelt

Add DD-GloVe files

parent 154be996
Showing 3460 additions and 0 deletions
SCHWULENSZENE
Def.: milieu, scene of male homosexuals
Definitional bias: 0.0890
STIMMBRUCH
Def.: change of voice in male adolescents during puberty, which shows itself in a voice that fluctuates uncontrollably between high and low and cracks easily, and which leads to a gradual deepening of the voice
Definitional bias: 0.0811
ZIKADE
Def.: small, cricket-like insect in which the male animals produce loud, chirping sounds
Definitional bias: 0.0314
MOSCHUS
Def.: strong-smelling secretion of the male musk deer, used especially in the manufacture of perfumes
Def.: fragrance obtained from musk, or a similar synthetically produced one
Definitional bias: 0.0130
ATLANT
Def.: entablature support in the form of a male figure
Definitional bias: -0.0024
KUDU
Def.: antelope (native to Africa) with reddish-brown fur bearing white transverse stripes, a short mane running from the neck to the back, and (in the male animal) twisted horns
Definitional bias: -0.0105
FRENULUM
Def.: small fold of skin or mucous membrane
Def.: fold of skin that connects the glans of the male member to the foreskin
Definitional bias: -0.0280
ER
Def.: person or animal of the male sex
Definitional bias: -0.0308
NEBENHODEN
Def.: organ of the male reproductive system that stores and conducts sperm
Definitional bias: -0.0349
EICHEL
Def.: elongated, round fruit of the oak
Def.: foremost part of the male member
Def.: foremost part of the clitoris
Def.: suit in the German card deck; Eckern
Definitional bias: -0.0622
"""Determine seed words as described by algorithm in the paper"""
import ast

import numpy as np
import pandas as pd
from sklearn.metrics.pairwise import cosine_similarity as cosine
from tqdm import tqdm

# load embeddings
embeddings = pd.read_csv("/workspace/students/reichelt/BA/data/dd-glove/english_vectors_gender.txt",
                         skiprows=1, header=None, sep=" ")
embeddings.rename(columns={0: "token"}, inplace=True)  # name first column "token"
embeddings["vector"] = embeddings.iloc[:, 1:].values.tolist()  # convert other columns into one, containing 300-dim vec as a list

# load definition indexes
definition_indexes = pd.read_csv("/workspace/students/reichelt/BA/data/dd-glove/english_definitions.txt",
                                 sep="\t", usecols=[1], header=None, names=["def_words"])
definition_indexes = definition_indexes["def_words"].tolist()
definition_indexes = [ast.literal_eval(item) for item in definition_indexes]
def get_definition_embedding(word_index: int) -> np.ndarray:
    """Calculate definition embedding by averaging embeddings of
    words occurring in given definition. Definition is given by
    a word index, e.g. 88, which is token 'mann'."""
    emb_sum = np.zeros(300)
    try:
        for i in definition_indexes[word_index]:  # look at all definitional words
            vec = np.array(embeddings["vector"].iloc[i])  # get embedding of a definitional word
            emb_sum += vec  # add to sum of all definitional embeddings
    except TypeError as exc:
        print(word_index)
        print(definition_indexes[word_index])
        raise TypeError from exc
    definitional_embedding = (1/len(definition_indexes[word_index])) * emb_sum  # build mean
    return definitional_embedding
def calculate_definitional_bias(v_1: np.ndarray, v_2: np.ndarray, w: np.ndarray) -> float:
    """Calculate bias of a word using its definitional embedding.
    Approximates the projection of the word's definitional embedding
    along the difference between the seed words' definitional embeddings
    as cosine(w, v_1) - cosine(w, v_2)."""
    boy_def = v_1.reshape(1, -1)  # reshape to fit sklearn's cosine function
    girl_def = v_2.reshape(1, -1)
    word_def = w.reshape(1, -1)
    bias = cosine(word_def, boy_def) - cosine(word_def, girl_def)
    return bias[0].item()  # sklearn's cosine returns a (1, 1) array; extract the scalar
def alternate_bias(v_1: np.ndarray, v_2: np.ndarray, w: np.ndarray) -> float:
    """Alternatively try out using actual projection as described
    textually in the paper instead of cosine - cosine.
    DON'T convert w to a unit vector through division by ||w||.
    Result: bias = w * (v1-v2) / ||(v1-v2)||
    """
    # unit_w = w / np.linalg.norm(w)
    diff = v_1 - v_2
    dot_product = np.dot(w, diff)
    norm = np.linalg.norm(diff)
    return dot_product / norm
# look up def embedding for initial seed words
mann_vec = get_definition_embedding(16)
frau_vec = get_definition_embedding(41)


# for each word in vocab, calculate bias and save results in vocab dataframe
def def_bias_application(vector_column_value):
    emb = np.array(vector_column_value)  # dataframe contains vectors as lists in column "vector"
    b = alternate_bias(mann_vec, frau_vec, emb)
    return b


# apply calculation to whole table with progress bar
tqdm.pandas()
embeddings["definitional_bias"] = embeddings["vector"].progress_apply(def_bias_application)

# get indices of the 30 highest and 30 lowest bias values (seed word candidates)
boy_indices = embeddings["definitional_bias"].nlargest(30).index
girl_indices = embeddings["definitional_bias"].nsmallest(30).index
print(f"boy indices: {boy_indices}")
print(f"girl indices: {girl_indices}")
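The script above defines two bias measures: a cosine difference and a plain projection onto the seed-difference direction. The following is a minimal sketch on made-up 2-dimensional vectors, not part of the commit, that shows how the two can rank words differently because the projection is not normalised by the length of w. The names mean_embedding, projection_bias, cosine_difference_bias and the toy vectors are hypothetical and chosen only for illustration.

```python
import numpy as np


def mean_embedding(vectors: np.ndarray, indices: list) -> np.ndarray:
    """Average the rows of `vectors` selected by `indices`."""
    return vectors[indices].mean(axis=0)


def projection_bias(v_1: np.ndarray, v_2: np.ndarray, w: np.ndarray) -> float:
    """w . (v_1 - v_2) / ||v_1 - v_2||, without normalising w."""
    diff = v_1 - v_2
    return float(np.dot(w, diff) / np.linalg.norm(diff))


def cosine_difference_bias(v_1: np.ndarray, v_2: np.ndarray, w: np.ndarray) -> float:
    """cos(w, v_1) - cos(w, v_2), which ignores the length of w."""
    def cos(a: np.ndarray, b: np.ndarray) -> float:
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
    return cos(w, v_1) - cos(w, v_2)


# Toy vocabulary (assumption: invented values, not real embeddings)
toy_vectors = np.array([
    [1.0, 0.0],  # index 0
    [0.0, 1.0],  # index 1
    [3.0, 1.0],  # index 2: long vector leaning towards index 0
    [1.0, 3.0],  # index 3: long vector leaning towards index 1
])

# Each "definition" here consists of a single word, so the mean is that word's vector
seed_a = mean_embedding(toy_vectors, [0])
seed_b = mean_embedding(toy_vectors, [1])

proj = [projection_bias(seed_a, seed_b, w) for w in toy_vectors]
cosd = [cosine_difference_bias(seed_a, seed_b, w) for w in toy_vectors]

print("projection:       ", np.round(proj, 4))
print("cosine difference:", np.round(cosd, 4))
print("top word by projection:", int(np.argmax(proj)))
print("top word by cosine difference:", int(np.argmax(cosd)))
```

In this toy setup the long vector at index 2 gets the largest projection, while the cosine difference ranks the seed vector at index 0 highest. Which behaviour is preferable for seed selection depends on whether vector length should count.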
SHE
Def.: the female person or animal being discussed or last mentioned; that female.
Def.: the woman: She who listens learns.
Def.: anything considered, as by personification, to be feminine: spring, with all the memories she conjures up.
Def.: a female person or animal.
Def.: an object or device considered as female or feminine.
Def.: she or he: used as an orthographic device to avoid a gender-specific pronoun when the gender of the antecedent is unknown or irrelevant.
Def.: refers to a female person or animal: she is a doctor; she's a fine mare
Def.: refers to things personified as feminine, such as cars, ships, and nations
Def.: Australian and NZ an informal word for it 1 (def. 3) she's apples; she'll be right
Def.:
Def.: a female person or animal
Def.: (in combination): she-cat
Definitional bias: -0.0242
SHE/HE
Def.: the female person or animal being discussed or last mentioned; that female.
Def.: the woman: She who listens learns.
Def.: anything considered, as by personification, to be feminine: spring, with all the memories she conjures up.
Def.: a female person or animal.
Def.: an object or device considered as female or feminine.
Def.: she or he: used as an orthographic device to avoid a gender-specific pronoun when the gender of the antecedent is unknown or irrelevant.
Def.: refers to a female person or animal: she is a doctor; she's a fine mare
Def.: refers to things personified as feminine, such as cars, ships, and nations
Def.: Australian and NZ an informal word for it 1 (def. 3) she's apples; she'll be right
Def.:
Def.: a female person or animal
Def.: (in combination): she-cat
Definitional bias: -0.0242
SHE/HER
Def.: the female person or animal being discussed or last mentioned; that female.
Def.: the woman: She who listens learns.
Def.: anything considered, as by personification, to be feminine: spring, with all the memories she conjures up.
Def.: a female person or animal.
Def.: an object or device considered as female or feminine.
Def.: she or he: used as an orthographic device to avoid a gender-specific pronoun when the gender of the antecedent is unknown or irrelevant.
Def.: refers to a female person or animal: she is a doctor; she's a fine mare
Def.: refers to things personified as feminine, such as cars, ships, and nations
Def.: Australian and NZ an informal word for it 1 (def. 3) she's apples; she'll be right
Def.:
Def.: a female person or animal
Def.: (in combination): she-cat
Definitional bias: -0.0242
SHE
Def.: the female person or animal being discussed or last mentioned; that female.
Def.: the woman: She who listens learns.
Def.: anything considered, as by personification, to be feminine: spring, with all the memories she conjures up.
Def.: a female person or animal.
Def.: an object or device considered as female or feminine.
Def.: she or he: used as an orthographic device to avoid a gender-specific pronoun when the gender of the antecedent is unknown or irrelevant.
Def.: refers to a female person or animal: she is a doctor; she's a fine mare
Def.: refers to things personified as feminine, such as cars, ships, and nations
Def.: Australian and NZ an informal word for it 1 (def. 3) she's apples; she'll be right
Def.:
Def.: a female person or animal
Def.: (in combination): she-cat
Definitional bias: -0.0242
FEMALE
Def.: relating to or being a woman or girl.
Def.: Biology.
Def.: of, relating to, or being a person with a certain combination of sex characteristics, commonly including two X chromosomes in the cell nuclei, a vagina, a uterus and ovaries, and enlarged breasts developed at puberty.
Def.: of, relating to, or being an animal, plant, or plant structure of the sex or sexual phase that normally produces egg cells during reproduction.
Def.: of, relating to, or characteristic of a female person; feminine: female suffrage; female charm.
Def.: comprising women or girls: a female readership.
Def.: Botany.
Def.: designating or pertaining to a plant or its reproductive structure that produces or contains elements requiring fertilization.
Def.: (of seed plants) pistillate.
Def.: Machinery. being or having a recessed part into which a corresponding part fits: a female plug.: Compare male (def. 3).
Def.: a female person.: See Usage note at the current entry.
Def.: Biology. an animal, plant, or plant structure of the sex or sexual phase that normally produces egg cells during reproduction.
Def.: of, relating to, or designating the sex producing gametes (ova) that can be fertilized by male gametes (spermatozoa)
Def.: of, relating to, or characteristic of a woman: female charm
Def.: for or composed of women or girls: female suffrage; a female choir
Def.: (of reproductive organs such as the ovary and carpel) capable of producing female gametes
Def.: (of gametes such as the ovum) capable of being fertilized by a male gamete in sexual reproduction
Def.: (of flowers) lacking, or having nonfunctional, stamens
Def.: having an internal cavity into which a projecting male counterpart can be fitted: a female thread
Def.:
Def.: a female animal or plant
Def.: derogatory a woman or girl
Def.: In organisms that reproduce sexually, being the gamete that is larger and less motile than the other corresponding gamete (the male gamete) of the same species. The egg cells of higher animals and plants are female gametes.
Def.: Possessing or being a structure that produces only female gametes. The ovaries of humans are female reproductive organs. Female flowers possess only carpels and no stamens.
Def.: Having the genitalia or other structures typical of a female organism. Worker ants are female but sterile.
Def.: A female organism.
Definitional bias: -0.0044
GENDER
Def.: either the male or female division of a species, especially as differentiated by social and cultural roles and behavior: the feminine gender. : Compare sex1 (def. 1).
Def.: a similar category of human beings that is outside the male/female binary classification.: See also third gender (def. 1), genderqueer (def. 3), nonbinary (def. 3).
Def.: the concept or system of categories such as male and female: Gender is a factor in pay rates across industries. More and more people have a nonbinary understanding of gender.
Def.: Grammar.
Def.: (in many languages) a set of classes that together include all nouns, membership in a particular class being shown by the form of the noun itself or by the form or choice of words that modify, replace, or otherwise refer to the noun, as, in English, the choice of he to replace the man, of she to replace the woman, of it to replace the table, of it or she to replace the ship. The number of genders in different languages varies from 2 to more than 20; often the classification correlates in part with sex or animateness. The most familiar sets of genders are of three classes (as masculine, feminine, and neuter in Latin and German) or of two (as common and neuter in Dutch, or masculine and feminine in French and Spanish).
Def.: one class of such a set.
Def.: such classes or sets collectively or in general.
Def.: membership of a word or grammatical form, or an inflectional form showing membership, in such a class.
Def.: Archaic. kind, sort, or class.
Def.: to attribute gender to, or to classify by gender: Gendering soaps seems a bit much; can't men and women use the same products? Usually when I wear my hair down people gender me as female.
Def.: Archaic. to engender.
Def.: Obsolete. to breed.
Def.: a set of two or more grammatical categories into which the nouns of certain languages are divided, sometimes but not necessarily corresponding to the sex of the referent when animate: See also natural gender
Def.: any of the categories, such as masculine, feminine, neuter, or common, within such a set
Def.: informal the state of being male, female, or neuter
Def.: informal all the members of one sex: the female gender
Definitional bias: -0.0042
YŌKAI
Definitional bias: 0.0000
PADMAVATI
Definitional bias: 0.0000
ZOILA
Definitional bias: 0.0000
PERFORMATIVITY
Definitional bias: 0.0000
'FEMALE
Definitional bias: 0.0000
LIUDMILA
Definitional bias: 0.0000
ECOFEMINIST
Definitional bias: 0.0000
SKOPELOS
Definitional bias: 0.0000
TRATA
Definitional bias: 0.0000
SANANANDA
Definitional bias: 0.0000
MYSTRA
Definitional bias: 0.0000
SPECTATORSHIP
Definitional bias: 0.0000
CERIDWEN
Definitional bias: 0.0000
BAIUL
Definitional bias: 0.0000
DAVACHI
Definitional bias: 0.0000
YIDAM
Definitional bias: 0.0000
NINSHUBUR
Definitional bias: 0.0000
OLP
Definitional bias: 0.0000
İNCI
Definitional bias: 0.0000
SPANA
Definitional bias: 0.0000
PUHAR
Definitional bias: 0.0000
NĀMA
Definitional bias: 0.0000
ADRIEN-MARIE
Definitional bias: 0.0000
RUSSIAN-JAPANESE
Definitional bias: 0.0000
ALAKSHMI
Definitional bias: 0.0000
GREEK-ITALIAN
Definitional bias: 0.0000
PAPAFLESSAS
Definitional bias: 0.0000
DESVAUX
Definitional bias: 0.0000
BIENENSTOCK
Def.: box-shaped container that serves as a dwelling for a bee colony
Definitional bias: 0.0011
INFORMATIONSVERANSTALTUNG
Def.: event that serves to provide information
Definitional bias: 0.0065
WANDSCHMUCK
Def.: something that serves to decorate a wall
Definitional bias: 0.0169
ABSPERRGITTER
Def.: grating that serves to cordon something off
Definitional bias: 0.0176
TÜRKE
Def.: demonym (designation of an inhabitant)
Def.: something that serves to feign something that does not exist, a nonexistent state of affairs
Def.: footage presented as a documentary recording but in fact re-enacted
Definitional bias: 0.0625
SELBSTINSZENIERUNG
Def.: the act of staging oneself
Def.: action or utterance that serves self-staging
Definitional bias: 0.0651
ORIENTIERUNGSHILFE
Def.: something that serves orientation, finding one's bearings
Definitional bias: 0.0724
WÄRMEDÄMMUNG
Def.: protection against heat or against heat loss
Def.: something that serves as thermal insulation
Definitional bias: 0.0997
HERRSCHAFTSINSTRUMENT
Def.: means that serves to dominate something or someone
Definitional bias: 0.1146
FUNKTIONÄR
Def.: full-time or honorary representative of a political, economic, social, or sports association who acts in dependence on such an organization and serves its interests
Def.: civil servant
Definitional bias: 0.1330
ZIEHHARMONIKA
Def.: simpler type of accordion
Definitional bias: -0.0178
NACHBAU
Def.: the act of building a replica
Def.: the replica that has been built
Definitional bias: 0.0005
BLUTZUCKER
Def.: glucose present in the blood serum
Definitional bias: 0.0255
KONSTRUKT
Def.: working hypothesis or conceptual auxiliary construction for describing inferred phenomena
Def.: something constructed; construction
Definitional bias: 0.0280
TÜRKE
Def.: demonym (designation of an inhabitant)
Def.: something that serves to feign something that does not exist, a nonexistent state of affairs
Def.: footage presented as a documentary recording but in fact re-enacted
Definitional bias: 0.0299
VERSTEIFUNG
Def.: the act of stiffening, of becoming stiff; the state of being stiffened
Def.: something that serves to stiffen something
Definitional bias: 0.0509
VORFERTIGUNG
Def.: the act of prefabricating
Def.: that which has been prefabricated
Definitional bias: 0.0641
PARADOXIE
Def.: paradoxical state of affairs; something absurd, contradictory
Definitional bias: 0.0847
BRAUCHBARKEIT
Def.: the state of being usable; usefulness
Def.: something usable
Definitional bias: 0.0868
PROVISORISCH
Def.: serving only as a temporary stopgap, only to bridge a state that is not yet final; provisional; makeshift
Definitional bias: 0.1034