kaiser / BA Timeline Summarization
Commit df1cc4c8, authored 3 years ago by vvye
Remove keyword function
Parent: 037dba51
Showing 3 changed files with 1 addition and 14 deletions:
sentence_selection.py: 0 additions, 3 deletions
summarization.py: 1 addition, 1 deletion
util.py: 0 additions, 10 deletions

sentence_selection.py (+0 −3)

 from datetime import datetime, timedelta
 from sklearn.metrics.pairwise import cosine_similarity
 import numpy as np
 from scipy import sparse
 import util
 ...
@@ -48,8 +47,6 @@ def candidate_sentences(articles, date, vectorizer):


 def sentences_published_on_date(articles, date, tolerance_days, num_first_sentences):
-    # implementation details are the same as ghalandari et al:
-    # a sentence is not included in the final list if it also mentions any date at all
     sentences = []
     for article in articles:
         pub_date = datetime.strptime(article['pub_date'], '%Y-%m-%d')
 ...

summarization.py (+1 −1)

 ...
@@ -41,7 +41,7 @@ def summarize(sentences, vectorizer, keywords, num_sentences):
     for i in sorted_indices:
         remaining_indices.remove(i)
         sentence = sentences[i]
-        if not util.contains_any(sentence['text'], keywords):
+        if not any([kw.lower() in sentence['text'].lower() for kw in keywords]):
             continue
         if redundant(i, selected_indices, X):
             continue
 ...
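
The changed line in summarization.py inlines the case-insensitive substring test that `util.contains_any` used to perform. A minimal sketch checking that the two forms agree; the name `contains_any_inline` and the sample keywords and texts are hypothetical and not taken from the project:

```python
def contains_any(string, keywords):
    """Helper as it appeared in util.py before this commit."""
    for keyword in keywords:
        if keyword.lower() in string.lower():
            return True
    return False


def contains_any_inline(text, keywords):
    """Inlined expression used in summarize() after this commit (hypothetical wrapper)."""
    return any([kw.lower() in text.lower() for kw in keywords])


# Hypothetical sample data, not taken from the project.
keywords = ["Syria", "ceasefire"]
texts = [
    "A CEASEFIRE was announced on Monday.",
    "Talks continued in Geneva.",
    "",
]

# Both forms should give the same answer for every text.
for text in texts:
    assert contains_any(text, keywords) == contains_any_inline(text, keywords)

# An empty keyword list matches nothing in either form.
assert contains_any("any text", []) is False
assert contains_any_inline("any text", []) is False
```

The list comprehension inside `any([...])` evaluates every keyword before `any` runs, whereas the removed loop returned on the first hit. The result is the same; a generator expression would restore the early exit if that ever matters.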

util.py (+0 −10)

 import os
-import re


 def subdirs(path):
 ...
@@ -10,15 +9,6 @@ def files(path, extension=None):
     return [f for f in os.listdir(path) if os.path.isfile(path / f) and (extension is None or f.endswith(extension))]


-def contains_any(string, keywords):
-    for keyword in keywords:
-        # following ghalandari, don't account for word boundaries
-        # if re.search(fr'\b{keyword.lower()}\b', string.lower()):
-        if keyword.lower() in string.lower():
-            return True
-    return False
-
-
 def avg(lst):
     return sum(lst) / len(lst)
 ...
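
The comment inside the removed `contains_any` notes that word boundaries are deliberately not taken into account. A small sketch of what that choice means in practice, comparing the substring test with the commented-out regex variant; the function names and sample sentences are hypothetical, and `re.escape` is an addition that the original commented-out line did not have:

```python
import re


def contains_any_substring(string, keywords):
    # Behaviour kept by this commit: plain case-insensitive substring test.
    return any(kw.lower() in string.lower() for kw in keywords)


def contains_any_word(string, keywords):
    # Variant from the commented-out line. re.escape is added here (an
    # assumption) so keywords with regex metacharacters match literally.
    return any(
        re.search(rf"\b{re.escape(kw.lower())}\b", string.lower())
        for kw in keywords
    )


# Hypothetical sentences: "war" occurs inside "award" but not as a word.
inside_word = "The award ceremony was held on Friday."
whole_word = "The war ended in 1945."

assert contains_any_substring(inside_word, ["war"]) is True
assert contains_any_word(inside_word, ["war"]) is False

assert contains_any_substring(whole_word, ["war"]) is True
assert contains_any_word(whole_word, ["war"]) is True
```

The substring test is the more permissive of the two, so short keywords can match inside unrelated words.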