University of Arizona, Department of Computer Science

CSc 120: Sequence-set Similarity

Expected Behavior

Write a Python function seq_set_sim(seq_set1, seq_set2, k) that takes two sets of strings, seq_set1 and seq_set2, and an integer k, and returns a floating-point value between 0 and 1 (inclusive) giving the similarity between the two sets. Compute the similarity value as follows:

You can use the code from the previous short problems as helper functions for this problem.

You can assume that seq_set1 and seq_set2 are both non-empty and that the strings in these sets all have length at least k.

Examples

  1. Call: seq_set_sim(set(['aaaa','aabb']), set(['aaab']), 3)
    Return value: 0.5

  2. Call: seq_set_sim(set(['aaabba','aabbcc']), set(['aaab','abbc']), 4)
    Return value: 0.3333333333333333

  3. Call: seq_set_sim(set(['aaabba','abbc']), set(['aaab','aabbcc']), 2)
    Return value: 0.6

  4. Call: seq_set_sim(set(['ababab','acacac']), set(['bababa','cacaca']), 3)
    Return value: 1.0

  5. Call: seq_set_sim(set(['abbbbba','bcccccb']), set(['aaaaab','aaaaac']), 3)
    Return value: 0.0
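
The computation steps are not reproduced in this text, so the following is only a sketch under a stated assumption, not necessarily the intended solution. It assumes the similarity is a multiset Jaccard index over k-grams: count every length-k substring occurrence across all strings in each set, then divide the size of the multiset intersection by the size of the multiset union. This reading reproduces all five return values above, whereas a Jaccard index over plain (duplicate-free) sets of k-grams would give 2/3 rather than 0.5 for Example 1. The helper name kgram_counts is hypothetical and may differ from the helpers written for the previous short problems.

```python
from collections import Counter


def kgram_counts(seq_set, k):
    """Count every length-k substring occurrence across all strings in seq_set.

    Hypothetical helper; assumes each string has length at least k.
    """
    counts = Counter()
    for s in seq_set:
        for i in range(len(s) - k + 1):
            counts[s[i:i + k]] += 1
    return counts


def seq_set_sim(seq_set1, seq_set2, k):
    """Assumed definition: multiset Jaccard index of the two k-gram multisets."""
    c1 = kgram_counts(seq_set1, k)
    c2 = kgram_counts(seq_set2, k)
    shared = sum((c1 & c2).values())  # multiset intersection: minimum of counts
    total = sum((c1 | c2).values())   # multiset union: maximum of counts
    # total is nonzero because both sets are non-empty and every string
    # has length at least k, so each set yields at least one k-gram.
    return shared / total


print(seq_set_sim(set(['aaaa', 'aabb']), set(['aaab']), 3))  # 0.5
```

Counter is used here because its & and | operators take the element-wise minimum and maximum of counts, which correspond to multiset intersection and union.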