Would turning a Diceware phrase into a sentence decrease its security?

It does not decrease the security. What is actually happening is that your "entropy calculator" is giving you a false measure of entropy. It can only give an approximate estimate, after all. There are interesting proofs showing that one can never know the amount of entropy in a particular string of text unless one knows something about how it was constructed. A pass string 1000 words long created by a physical random number generator, such as a resistor noise network, will appear to have the same amount of entropy as a pass string 1000 words long generated using a Mersenne Twister, until you realize that the Mersenne Twister leaks its entire internal state in any contiguous block of 624 consecutive 32-bit outputs. Entropy calculators can only make heuristic assumptions about how random the data actually is.
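To make the Mersenne Twister point concrete, here is a minimal sketch using Python's `random` module, which is built on MT19937. It assumes the attacker sees 624 consecutive raw 32-bit outputs; the helper names (`untemper`, `clone_from_outputs`) are mine, not part of any library.

```python
import random

def undo_right_xor(y, shift):
    """Invert the step x ^= x >> shift on a 32-bit value."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def undo_left_xor(y, shift, mask):
    """Invert the step x ^= (x << shift) & mask on a 32-bit value."""
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x

def untemper(y):
    """Reverse MT19937's output tempering to recover one raw state word."""
    y = undo_right_xor(y, 18)
    y = undo_left_xor(y, 15, 0xEFC60000)
    y = undo_left_xor(y, 7, 0x9D2C5680)
    y = undo_right_xor(y, 11)
    return y

def clone_from_outputs(outputs):
    """Build a generator that continues the sequence after 624 observed outputs."""
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    # Index 624 means "state exhausted, regenerate on the next call".
    state = tuple(untemper(v) for v in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone

victim = random.Random(2024)  # the seed is unknown to the attacker
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_from_outputs(observed)

predicted = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
print(predicted == actual)
```

The observed outputs would pass any statistical entropy estimate, yet everything after them is predictable, which is exactly why a calculator looking only at the text cannot be trusted.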

This, of course, is why we have Diceware. It can prove a lower bound on entropy because randomness is built into the process. To prove the security of a pass-sentence like the one you are looking at, consider an oracle test. I select a bunch of words using Diceware, and then I build a sentence out of them. I then provide you with an oracle which constructs sentences out of sets of words. It is guaranteed that, if you provide the oracle with the correct set of selected words from Diceware, it will produce exactly the sentence I used. For all other sets of words, it will produce an arbitrary sentence using them. It is trivial to see that the entropy of my password cannot possibly be lower than the entropy built into the Diceware words I selected. Even with this immensely powerful oracle to reduce the very human process of sentence formation to nothingness, the randomness from Diceware will remain. You cannot guess my password any faster than you could guess the original set of Diceware words I selected.
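A toy version of the oracle argument can be sketched as follows. The six-word list stands in for the real 7776-word Diceware list, and `oracle` is a made-up deterministic sentence builder; both are assumptions for illustration only.

```python
import itertools
import math

# Hypothetical stand-in for the real Diceware word list.
TOY_WORDLIST = ["tracy", "optic", "renown", "pizza", "eat", "annual"]

def oracle(words):
    """Made-up oracle: the same words always yield the same sentence."""
    return " is ".join(words).capitalize() + "."

def attacker_guesses(num_words):
    """Every sentence an attacker holding the oracle would still have to try."""
    tuples = itertools.product(TOY_WORDLIST, repeat=num_words)
    return [oracle(t) for t in tuples]

guesses = attacker_guesses(2)
distinct = len(set(guesses))
print(distinct, "candidate sentences,", round(math.log2(distinct), 2), "bits")
```

The oracle removes all the work of forming sentences, but the attacker is still left with one candidate per word tuple, so the search space is no smaller than that of the words alone.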

Now there are a few caveats. If you use fewer Diceware words, as in your later example, you get fewer bits of entropy from the Diceware layer. This means that the oracle I mentioned above becomes more and more helpful for breaking the sentence-based password. Also, some of the sets of words you get from Diceware can be particularly difficult to turn into sentences. If you ever reject a set of Diceware words as part of your pass-sentence building process, you are calling into question the perfect randomness that Diceware relies on.
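Both caveats can be put into rough numbers. The sketch below assumes the standard 7776-word list and a simple model of rejection: only a fraction `kept_fraction` of all possible rolls would ever be accepted, the accepted ones are equally likely, and the attacker knows the acceptance rule. That parameter is hypothetical; real rejection habits are much harder to measure.

```python
import math

WORDLIST_SIZE = 6 ** 5  # 7776 words, five dice per word

def bits_with_rejection(num_words, kept_fraction=1.0):
    """Entropy in bits when only kept_fraction of all rolls is ever accepted."""
    if not 0.0 < kept_fraction <= 1.0:
        raise ValueError("kept_fraction must be in (0, 1]")
    return num_words * math.log2(WORDLIST_SIZE) + math.log2(kept_fraction)

for n in (3, 4, 5, 6):
    honest = bits_with_rejection(n)
    picky = bits_with_rejection(n, kept_fraction=0.25)
    print(f"{n} words: {honest:.1f} bits, or {picky:.1f} if 3 in 4 rolls are rejected")
```

Under this model, keeping only one roll in four costs two bits regardless of length, while dropping a word costs almost 13.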

Now, why the oracle attack? Oracles are very powerful tools for testing cryptographic theory. In reality, "tracy is optically renowned worldwide" is probably quite a lot stronger than the 38.7 bits from the Diceware words "tracy optic renown". Breaking that sentence will take more work than breaking the words alone, though probably not the full 100.504 bits the entropy calculator heuristically estimates. So how much stronger? We don't know. That's the point of oracle attacks. In an oracle attack we say "let's just assume this hard-to-calculate part of the process offers zero increased security. None at all. Is the process still secure?" If it is secure under this extreme assumption, then it is clearly secure against real-life attacks, where the attacker doesn't necessarily have such a magically powerful oracle at their disposal.
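The 38.7-bit figure is just three words times the bits per Diceware word. A quick check, assuming the standard 7776-word list:

```python
import math

WORDLIST_SIZE = 6 ** 5  # 7776 words, five dice per word

def diceware_bits(num_words):
    """Guaranteed entropy, in bits, of num_words uniformly rolled words."""
    return num_words * math.log2(WORDLIST_SIZE)

def candidate_count(num_words):
    """Number of equally likely word tuples an attacker must consider."""
    return WORDLIST_SIZE ** num_words

print(f"{diceware_bits(3):.2f} bits")  # 38.77 bits
print(candidate_count(3))              # 470184984576
```

That is the floor the oracle argument guarantees. The calculator's 100.504 bits is a guess about the surface text, not a property of how the phrase was generated.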


Assuming you go with whatever words you roll (as opposed to rolling until you find something you can make a sentence out of), and you use them in the order they were rolled (not rearranging them to make a better sentence), this scheme cannot decrease the entropy. It will increase it, but to what extent is hard to quantify.

Assume the worst case, that the attacker knows you are using this scheme. The entropy of the diceware words is unchanged. If you were using those words alone, the attacker would have to try every tuple of diceware words. But now the attacker has to take each tuple and insert it into one of several possible sentence forms.

The sentence forms that make sense grammatically will vary according to the parts of speech of the diceware words. Some might take multiple parts of speech; "annual" could be an adjective or a noun, for example. It's also possible you might use incorrect grammar on purpose.

So the number of possible passphrases has increased, and so has the entropy. However, since the amount of entropy increase is difficult to quantify, I would assume it's zero, and use as many diceware words as you would use without this scheme. The quantifiable advantage is that the expanded phrase is easier to memorize.
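As a rough sketch of that accounting: if every word tuple could be expanded into the same number of equally likely sentence forms (a simplifying assumption of mine, with `forms_per_tuple` a hypothetical parameter), the forms would add `log2(forms_per_tuple)` bits on top of the Diceware floor.

```python
import math

WORDLIST_SIZE = 6 ** 5  # 7776 words in the standard list

def passphrase_bits(num_words, forms_per_tuple=1):
    """Diceware bits plus whatever the (hypothetical) sentence forms add."""
    if forms_per_tuple < 1:
        raise ValueError("forms_per_tuple must be at least 1")
    return num_words * math.log2(WORDLIST_SIZE) + math.log2(forms_per_tuple)

floor = passphrase_bits(4)                         # assume the forms add nothing
optimistic = passphrase_bits(4, forms_per_tuple=8)
print(f"floor {floor:.1f} bits, optimistic {optimistic:.1f} bits")
```

Since nobody knows the real value of `forms_per_tuple`, planning with the default of 1 matches the advice above: count only the Diceware bits.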


Diceware gets its security from the number of bits of entropy per word. We'll start with the assumption that you've selected the words at random, in a particular order.

If you add words in between to turn it into a sentence, it still has the same entropy and is therefore just as secure. There is one possible issue if you type the added words as part of the password: they may provide hints to somebody who has cracked part of it. If your words are "pizza eat", you could memorize them as "pizza is what I eat". But if you enter that whole sentence and somebody figures out "pizza is what I...", then it's not hard to guess "eat". If you only type "pizzaeat", you don't have that issue.

If you rearrange the words, though, you decrease the security, because you artificially limit the options for each word to those that work well with the following word. For example, you eliminate combinations like "pizza eat" because they wouldn't make sense, so the number of options for the first word is smaller: "pizza" is no longer one of them.
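One way to bound the damage from rearranging, which is my own framing rather than part of the answer above: in the worst case the final order is fully determined by which words were rolled, so the order carries no information. Each set of n words corresponds to at most n! orderings, so the loss is at most log2(n!) bits.

```python
import math

WORDLIST_SIZE = 6 ** 5  # 7776 words in the standard list

def ordered_bits(num_words):
    """Entropy when the words are used in the order they were rolled."""
    return num_words * math.log2(WORDLIST_SIZE)

def max_reordering_loss(num_words):
    """Upper bound on the bits lost by freely rearranging the rolled words."""
    return math.log2(math.factorial(num_words))

for n in (2, 3, 6):
    kept = ordered_bits(n) - max_reordering_loss(n)
    print(f"{n} words: at least {kept:.1f} of {ordered_bits(n):.1f} bits remain")
```

For two words the worst case is a single bit, but the bound grows with length, and it does not cover the separate cost of rejecting rolls that refuse to form a sentence.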