Rhyme Scheme

Many have noticed that large language models are rarely good at rhyming, but few know why. One might believe that the vectorization of words makes it difficult for models to analyze them from perspectives other than semantic meaning. But that is not entirely true! The real reason is that Big Rhyme wants to see a world without rhymes in the future. This is something that Rhyme-Ryan has noticed. Rhyme-Ryan loves rhymes, so he has devised a scheme against Big Rhyme to ensure that rhymes will survive in the future. He wants to create a model that checks whether two words rhyme, so that a rhyme-checker can be added to the large language models. He has asked you to build such a model for him and defeat Big Rhyme. He wants a model to which he can provide two words and receive 1 if they rhyme and 0 otherwise.

Input

The attachments contain the following files:

  • train.csv - Pairs of words and whether they rhyme.

  • test.csv - Pairs of words that may rhyme; it is your model’s task to predict whether they rhyme.

  • baseline.ipynb - A basic solution to the task. It appends its own source code to the submission.

  • print_source_code.py - A utility script to print your source code as a comment, useful for including your solution in the submission file. Does not work inside a Jupyter Notebook.
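As a starting point before training anything, a crude heuristic can classify a pair of words by whether they share a common suffix. This is only a sketch, not the provided baseline.ipynb: real rhyme depends on pronunciation, and the function name, the cutoff `k`, and the case-folding are all choices made here for illustration.

```python
def rhymes(word1: str, word2: str, k: int = 3) -> int:
    # Crude proxy: call it a rhyme if the last k letters match.
    # Spelling is an imperfect stand-in for pronunciation, so this
    # only serves as a simple reference point for accuracy.
    a, b = word1.lower(), word2.lower()
    return int(a[-k:] == b[-k:])
```

For example, `rhymes("fighter", "lighter")` returns 1 (shared suffix "ter"), while `rhymes("cat", "dog")` returns 0. Measuring this heuristic's accuracy on train.csv shows how far a spelling-only rule gets you before a learned model is needed.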

Output

For each pair of words in test.csv, you should output an integer (1 or 0): whether the words rhyme (1) or do not rhyme (0).

Scoring

Your solution will be scored based on the number of correct predictions it makes.

Let $S$ be the total number of correctly identified rhymes or non-rhymes across all pairs of words divided by the number of word pairs (that is to say, your accuracy). If $S \leq 0.81$, you get $0$ points. Otherwise, your score is:

\[ \text{Points} = 100 \cdot \min \left(1, \sqrt{\frac{S - 0.81}{0.16}} \right). \]

In particular, you get $100$ points if your accuracy is $97\% $ or higher.
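The scoring rule above translates directly into a small helper, useful for estimating your points from a locally measured accuracy (the function name is arbitrary; the constants come from the formula in this section):

```python
import math

def points(S: float) -> float:
    # Accuracy at or below the 0.81 threshold earns nothing.
    if S <= 0.81:
        return 0.0
    # Otherwise scale by the square-root formula, capped at 100 points.
    return 100 * min(1.0, math.sqrt((S - 0.81) / 0.16))
```

For instance, an accuracy of 0.85 yields `100 * sqrt(0.04 / 0.16) = 50` points, and anything at or above 0.97 yields the full 100.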

Testing

During the contest, your solution will be scored on $30\% $ of the data in test.csv. After the contest ends, all solutions will be re-scored on the remaining $70\% $ of the data. It is guaranteed that test.csv was split into the two sets uniformly at random, with no overlap between the sets. This means that your score during the contest should be seen as a strong indicator of your final score, but might differ if you overfit.
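To see how much your preliminary score might drift from your final score, you can mimic the contest's uniform split on held-out labeled data. This sketch splits a list of items 30/70 at random; the function name and seed are illustrative, not part of the judge:

```python
import random

def split_30_70(items, seed=0):
    # Shuffle indices and cut at 30%, mirroring the contest's uniform
    # random split of test.csv into preliminary and final sets.
    rng = random.Random(seed)
    idx = list(range(len(items)))
    rng.shuffle(idx)
    cut = round(0.3 * len(items))
    prelim = [items[i] for i in idx[:cut]]
    final = [items[i] for i in idx[cut:]]
    return prelim, final
```

Scoring your model separately on both halves of such a split gives a rough sense of the variance to expect between the 30% and 70% evaluations.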
