Cryptogram solver
Newspaper puzzles beware. I have been working on soving newspaper cryptograms for the past few days in my free time. It is a bit more interesting of a problem to solve than sudoku I think. The algorithm I've implementing is nothing alltogether impressive and it relies on a good dictionary file to get solutions.
The algorithm works like so:
First you need an index which is a map of abstract words to lists of words.
An abstract word is taking a word and turning it into a pattern, for instance, 'cat' has the pattern '123', as does 'hat', and 'fat'. 'mom' has the pattern '121'.
Then it takes the sentence you have given it, splits it on spaces.
Popping the first word off the list, finds the list in the index of possible words it could be, iterates over it generating a map of letters to letters and recurses on the rest of the sentence, once it has reached the end of the sentence it puts the map it has generated into a list of possible solutions. If a generated map for a word conflicts with teh current map it is not a valid solution and moves onto the next possible solution.
The solve function returns a list of solution maps that can eb applied to any sentence to get the result.
Little things that make it helpful include being able to give an intial map, if you are sure some letters map to other letters you can give that as a hint. It can also remove any words whos pattern does not appear in the map.
Todo:
Solve only those subsets of words that, if solved, will result in the entire alphabet being solved. It should do this for all possible combinations of words in teh sentence that will result in this. This is not an optimizations, it should probably make it take longer to solve actually, however it allows it to solve sentences with words that might not exist in the dictionary but could come about as a result of solving the other words. I have a few ideas of how to do this but havn't had time to implement it yet.
The current code can be downloaded here.
The algorithm works like so:
First you need an index which is a map of abstract words to lists of words.
An abstract word is taking a word and turning it into a pattern, for instance, 'cat' has the pattern '123', as does 'hat', and 'fat'. 'mom' has the pattern '121'.
Then it takes the sentence you have given it, splits it on spaces.
Popping the first word off the list, finds the list in the index of possible words it could be, iterates over it generating a map of letters to letters and recurses on the rest of the sentence, once it has reached the end of the sentence it puts the map it has generated into a list of possible solutions. If a generated map for a word conflicts with teh current map it is not a valid solution and moves onto the next possible solution.
The solve function returns a list of solution maps that can eb applied to any sentence to get the result.
Little things that make it helpful include being able to give an intial map, if you are sure some letters map to other letters you can give that as a hint. It can also remove any words whos pattern does not appear in the map.
Todo:
Solve only those subsets of words that, if solved, will result in the entire alphabet being solved. It should do this for all possible combinations of words in teh sentence that will result in this. This is not an optimizations, it should probably make it take longer to solve actually, however it allows it to solve sentences with words that might not exist in the dictionary but could come about as a result of solving the other words. I have a few ideas of how to do this but havn't had time to implement it yet.
The current code can be downloaded here.