Add examples for INDs
Add examples for inclusion dependency detection algorithms
Senichenkov authored and chernishev committed Mar 26, 2024
1 parent ee78975 commit d773897
Showing 6 changed files with 80 additions and 0 deletions.
9 changes: 9 additions & 0 deletions examples/datasets/ind_datasets/course.csv
@@ -0,0 +1,9 @@
Course ID,Title,Department name
IT-1,Computer Science,Institute of Information Technology
MM-3,Algebra,Mathematics and Mechanics Faculty
H-1,History,Institute of History
FL-2,English,Faculty of Foreign Languages
IT-2,Programming,Institute of Information Technology
S-5,Philosophy,Faculty of Sociology
P-2,Physics,Faculty of Physics
C-8,Chemistry,Institute of Chemistry
9 changes: 9 additions & 0 deletions examples/datasets/ind_datasets/department.csv
@@ -0,0 +1,9 @@
Department name,Building
Institute of Information Technology,5 Academic av.
Mathematics and Mechanics Faculty,3 Academic av.
Institute of History,29A University st.
Faculty of Foreign Languages,10 Science sq.
Faculty of Sociology,29C University st.
Faculty of Physics,10 Academic av.
Institute of Chemistry,11 Academic av.
Graduate School of Managemment,49 Science sq.
8 changes: 8 additions & 0 deletions examples/datasets/ind_datasets/instructor.csv
@@ -0,0 +1,8 @@
ID,Name,Department name,Salary
in1089,Prof. Jones,Mathematics and Mechanics Faculty,$12000
in6723,Dr. Powers,Faculty of Sociology,$8000
in5555,Larry Thompson,Graduate School of Managemment,$5000
in8930,Prof. Burgess,Faculty of Sociology,$11500
in4520,David Stewart,Institute of Chemistry,$5200
in6577,Dr. Holloway,Mathematics and Mechanics Faculty,$9000
in9910,Dr. Rose,Institute of History,$8500
9 changes: 9 additions & 0 deletions examples/datasets/ind_datasets/student.csv
@@ -0,0 +1,9 @@
ID,Name,Department name
st104726,Darlene Johnson,Institute of Chemistry
st967925,Alice Green,Mathematics and Mechanics Faculty
st760375,Olga Jones,Graduate School of Managemment
st779090,Felix Brown,Faculty of Sociology
st299471,Angela Ramirez,Faculty of Sociology
st887788,Debbie Lewis,Graduate School of Managemment
st679973,Evelyn Obrien,Mathematics and Mechanics Faculty
st897856,Melissa Smith,Institute of Information Technology
6 changes: 6 additions & 0 deletions examples/datasets/ind_datasets/teaches.csv
@@ -0,0 +1,6 @@
Instructor ID,Course ID,Year,Semester
in1089,MM-3,2,Fall
in6723,S-5,1,Spring
in8930,S-5,3,Fall
in4520,C-8,2,Fall
in6577,MM-3,1,Fall
39 changes: 39 additions & 0 deletions examples/mining_ind.py
@@ -0,0 +1,39 @@
import desbordante
import csv

def row_to_padded_string(row, widths):
    return ''.join(field.ljust(width) for width, field in zip(widths, row))

def print_table(filename: str):
    with open(filename, newline='') as table:
        rows = list(csv.reader(table, delimiter=','))

    column_widths = []
    for col_num in range(len(rows[0])):
        max_len = max(len(row[col_num]) for row in rows)
        column_widths.append(max_len + 3)

    header, *data_rows = rows
    print(row_to_padded_string(header, column_widths))
    print('-' * sum(column_widths))
    print('\n'.join(row_to_padded_string(row, column_widths) for row in data_rows))

TABLES = [(f'examples/datasets/ind_datasets/{table_name}.csv', ',', True) for table_name in
          ['course', 'department', 'instructor', 'student', 'teaches']]

algo = desbordante.ind.algorithms.Default()
algo.load_data(tables=TABLES)
algo.execute()
inds = algo.get_inds()
print('Found inclusion dependencies (-> means "is included in"):\n')
for ind in inds:
    print(ind)

print()
print('Tables for first IND:')
print('course.csv:\n')
print_table('examples/datasets/ind_datasets/course.csv')

print()
print('department.csv:\n')
print_table('examples/datasets/ind_datasets/department.csv')
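
The script above leaves the actual detection to desbordante and then prints the course and department tables. As a rough illustration of what a unary inclusion dependency between those two tables means, here is a minimal plain-Python sketch that does not use desbordante at all. It assumes the dependency of interest is between the `Department name` columns of the two files. The helper names `column_values` and `holds_unary_ind` are hypothetical and not part of this commit. The CSV contents are copied inline from the datasets above so the sketch runs on its own.

```python
import csv
import io

# Inline copies of two of the example tables (taken verbatim from this commit),
# so the sketch runs without the repository checked out.
COURSE_CSV = """Course ID,Title,Department name
IT-1,Computer Science,Institute of Information Technology
MM-3,Algebra,Mathematics and Mechanics Faculty
H-1,History,Institute of History
FL-2,English,Faculty of Foreign Languages
IT-2,Programming,Institute of Information Technology
S-5,Philosophy,Faculty of Sociology
P-2,Physics,Faculty of Physics
C-8,Chemistry,Institute of Chemistry
"""

DEPARTMENT_CSV = """Department name,Building
Institute of Information Technology,5 Academic av.
Mathematics and Mechanics Faculty,3 Academic av.
Institute of History,29A University st.
Faculty of Foreign Languages,10 Science sq.
Faculty of Sociology,29C University st.
Faculty of Physics,10 Academic av.
Institute of Chemistry,11 Academic av.
Graduate School of Managemment,49 Science sq.
"""


def column_values(csv_text, column):
    """Hypothetical helper: distinct values of one column of a CSV string."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[column] for row in reader}


def holds_unary_ind(lhs_values, rhs_values):
    """Hypothetical helper: lhs -> rhs holds when every lhs value occurs in rhs."""
    return lhs_values <= rhs_values


course_departments = column_values(COURSE_CSV, 'Department name')
all_departments = column_values(DEPARTMENT_CSV, 'Department name')

# Every department that offers a course is listed in department.csv ...
forward = holds_unary_ind(course_departments, all_departments)
# ... but not every listed department offers a course.
backward = holds_unary_ind(all_departments, course_departments)

print('course -> department:', forward)
print('department -> course:', backward)
print('departments without courses:', sorted(all_departments - course_departments))
```

A set-inclusion test like this is only the definition applied to one column pair. The algorithms exposed under `desbordante.ind.algorithms` search over column combinations across all loaded tables, which this sketch makes no attempt to reproduce.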
