Chemistry - Does IUPAC nomenclature have the ability to name all organic compounds?

Solution 1:

Definitely not.

You got yourself into trouble by specifying "all" organic compounds, because there is a truly immense, mind-boggling number of possible compounds. No one even knows how to determine such a quantity accurately. A very rough estimate, making some drastic simplifications such as the use of only carbon, hydrogen, oxygen, nitrogen and sulfur atoms, is that there are some $10^{63}$ "reasonable" unique structures for compounds with a molecular weight below $\rm{500\ g\ mol^{-1}}$. Thanks to combinatorics, chemical space is enormous. We (or all sapient species in our observable volume, for that matter) will never come close to scratching its surface.
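To get a feeling for how quickly combinatorics takes over, here is a toy count in Python. It is only a sketch under loud assumptions: it counts unbranched chains of heavy atoms drawn from C, N, O and S, treats a chain and its mirror-reversed sequence as the same thing, and ignores valence, branching, rings and hydrogens entirely. The function name `linear_skeletons` is made up for this illustration, and the result is not an estimate of chemical space, just a demonstration of the growth rate.

```python
def linear_skeletons(n_atoms: int, alphabet: int = 4) -> int:
    """Count sequences of n_atoms symbols, counting a sequence and its
    reversal only once (toy model: no valence rules, no branching)."""
    total = alphabet ** n_atoms
    # Sequences that read the same in both directions (palindromes).
    palindromes = alphabet ** ((n_atoms + 1) // 2)
    # Burnside's lemma for the two-element group {identity, reversal}.
    return (total + palindromes) // 2


for n in (5, 10, 20, 30):
    print(n, linear_skeletons(n))
```

Even this deliberately crippled model grows by roughly a factor of four with every added atom, which is why any nomenclature that enumerates special cases will always lag behind what can be drawn.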

Furthermore, IUPAC nomenclature is largely created a posteriori. That is, though there are many rules trying to cover as many bases as possible (in the process becoming quite unwieldy at times), eventually some unexpected compound with unusual connectivity is discovered and becomes of wide interest. Thus standardising its nomenclature and that for closely related structures becomes fundamental to allow communication between scientists. A recent example of this occurred with fullerenes, which quickly jumped to prominence after 1985. IUPAC just had to create an entire new section of nomenclature for this class of compounds, which is not at all uncommon.

The closest thing to an absolute method of describing a compound's structure is to have a table of positional data (XYZ coordinates) giving the relative positions between the atoms determined from X-ray/neutron diffraction. Any attempt at simplifying this data will be lossy, whether the structures are drawn (not too lossy) or named (very lossy).
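As a rough illustration of that lossiness, the Python sketch below reduces two slightly different sets of coordinates for water to a bond list. Everything in it is an assumption made for the example: the geometries are approximate textbook-style values rather than diffraction data, the 1.2 Å distance cutoff is an arbitrary choice, and the helper names `water` and `bonds` are hypothetical.

```python
import math
from itertools import combinations


def water(oh_length, angle_deg):
    """Build illustrative XYZ coordinates (in angstrom) for H2O in a plane."""
    a = math.radians(angle_deg)
    return [
        ("O", (0.0, 0.0, 0.0)),
        ("H", (oh_length, 0.0, 0.0)),
        ("H", (oh_length * math.cos(a), oh_length * math.sin(a), 0.0)),
    ]


def bonds(atoms, cutoff=1.2):
    """Call two atoms bonded if they are closer than an assumed cutoff."""
    found = set()
    for (i, (_, p)), (j, (_, q)) in combinations(enumerate(atoms), 2):
        if math.dist(p, q) < cutoff:
            found.add((i, j))
    return found


relaxed = water(0.96, 104.5)    # roughly the equilibrium geometry
distorted = water(1.00, 110.0)  # stretched bonds, wider angle

# Different coordinates, identical connectivity: the bond list has
# thrown away the bond lengths and the angle.
same_graph = bonds(relaxed) == bonds(distorted)
print(bonds(relaxed), same_graph)
```

A drawing or a name sits even further down this chain of simplifications, since it starts from the connectivity rather than from the coordinates.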

The structures you show have comparatively simple IUPAC names, in fact. Heme is a type of porphyrin, which is a widely occurring framework in biomolecules. The central framework can have its positions numbered and the substituent in each one read off separately. Regarding Zantac, the Wikipedia page for the compound states its IUPAC name in the "Identifier" section in the box at the right, namely N-(2-[(5-[(dimethylamino)methyl]furan-2-yl)methylthio]ethyl)-N'-methyl-2-nitroethene-1,1-diamine.

As some interesting examples of the relations between available nomenclature and chemical space, consider the following:

  • 1,1,1,2,2,2-Hexaphenylethane : A molecule with a simple structure and a simple IUPAC name which likely cannot exist in reasonable conditions.

  • Maitotoxin : an awe-inspiring biomolecule with a rather large structure but fairly simple connectivity between atoms, whose IUPAC name exists but is quite complex: disodium (2S,3R,4R,4aS,5aR,6aS,7aS,8R,9R,10R,11aR,12R,12aR,13aS,14aR)-10-[(2R,3R,4R,4aS,6S,7R,8aS)-6-[(1R,3R)-4-[(2S,3R,4R,4aS,6R,7R,8aS)-6-[(1R,3S,5R,7S,9R,10R,12R,13S,14S,16R,19S,21R,23S,25S,28R,30S)-25-[(1S,3R,5S,7R,9S,11S,14R,16S,18R,20S,21Z,24R,26S,28R,30S,32R,34R,35R,37S,39R,42S,44R)-11-[(1S,2R,4R,5S)-1,2-dihydroxy-4,5-dimethyloct-7-en-1-yl]-35-hydroxy-14,16,18,32,34,39,42,44-octamethyl-2,6,10,15,19,25,29,33,38,43-decaoxadecacyclo[22.21.0.0³,²⁰.0⁵,¹⁸.0⁷,¹⁶.0⁹,¹⁴.0²⁶,⁴⁴.0²⁸,⁴².0³⁰,³⁹.0³²,³⁷]pentatetracont-21-en-34-yl]-9,13-dihydroxy-3,7,14,19,30-pentamethyl-2,6,11,15,20,24,29-heptaoxaheptacyclo[17.12.0.0³,¹⁶.0⁵,¹⁴.0⁷,¹².0²¹,³⁰.0²³,²⁸]hentriacontan-10-yl]-3,4,7-trihydroxy-octahydropyrano[3,2-b]pyran-2-yl]-1,3-dihydroxybutyl]-3,4,7-trihydroxy-octahydropyrano[3,2-b]pyran-2-yl]-2-[(2S,3R)-2,3-dihydroxy-3-[(1S,3R,5S,6S,7R,8S,10R,11R,13S,15R,17S,19R,21R,22S,24S,25S,26R)-6,7,11,21,25-pentahydroxy-13,17-dimethyl-8-[(2R,3R,4R,7S,8R,9R,11R,13E)-3,8,11,15-tetrahydroxy-4,9,13-trimethyl-12-methylidene-7-(sulfonatooxy)pentadec-13-en-2-yl]-4,9,14,18,23,27-hexaoxahexacyclo[13.12.0.0³,¹³.0⁵,¹⁰.0¹⁷,²⁶.0¹⁹,²⁴]heptacosan-22-yl]propyl]-4,8,9,12-tetrahydroxy-hexadecahydro-2H-1,5,7,11,13-pentaoxapentacen-3-yl sulfate

  • This hydrocarbon : A molecule which likely exists, with a seemingly very simple structure, but with slightly quirky connectivity which makes naming it a challenge. Add a few more bridges and I'm sure you can break any existing nomenclature rules.

Solution 2:

Interesting question. Let's start with your examples.

You probably meant real heme; here is the corrected structure:

[Image: corrected heme structure]

"Heme" is a recognized name in IUPAC tetrapyrrole nomenclature.

But it can also be named as (protoporphyrinato)iron(II).

Protoporphyrin can be named as
3,7,12,17-tetramethyl-8,13-divinyl-2,18-porphinedipropionic acid, or
2,18-bis(2-carboxyethyl)-3,7,12,17-tetramethyl-8,13-divinyl-2,18-porphine.

Porphine can be named as 21,22-dihydroporphyrin.

Porphyrin could then be named systematically as some ugly a,b,c,d-tetraazapentacyclo[X.Y.Z.UV,W.RS,T.OP,Q]tetracosane-i,j,k,...-undecaene (the letters are placeholders to be replaced with the correct numbers).

How do we put these names together with iron? By using the nomenclature of coordination compounds. But let us name the organic ligand with the somewhat more recent phane nomenclature, which produces a less ugly name. It is a powerful tool that abstracts local structures into "superatoms" of a simplified skeleton (and it can be used nicely to name the interesting-looking organic compound discussed here).

[Image: heme sketch]

So the final name could be
[5⁴,7³-bis(2-carboxyethyl)-1⁴,3³,5³,7⁴-tetramethyl-1³,3⁴-divinyl-1¹H,5¹H-1,3,5,7(2,5)-tetrapyrrolacyclooctaphane-2(3²),4(5²),5⁵(6),7⁵(8)-tetraene-N,N′′-diyl-κ⁴N,N′,N′′,N′′′]iron(II).

There is likely a mistake somewhere in there, but I can reconstruct the structure from the name according to the rules, check whether it is correct, and fix it if not.

The second example is named by some software in a fraction of a second as (Z)-N¹-{2-[({5-[(dimethylamino)methyl]furan-2-yl}methyl)sulfanyl]ethyl}-N′¹-methyl-2-nitroethene-1,1-diamine.

So yes, IUPAC nomenclature is constructed so that chemists can name pretty much anything. Is some strange functional group present? You can decompose it into building blocks (ad absurdum, you can name the -COOH group as hydroxycarbonyl).

However, these rules are constructed by humans, and as far as I know there is no formal proof that they can name every possible structure. They almost certainly cannot (yet) describe, for example, the stereochemistry of highly complex molecules with certain kinds of topological isomerism, such as complicated knots. Is the "pyrrole" component of the systematic heme name too trivial? It is already a correct IUPAC name, but you can name it "more" systematically as 1-azacyclopenta-2,4-diene (if I'm not wrong with the numbering). At the lowest level, the raw weighted graph of element nodes, systematic nomenclature probably cannot operate in a practical way. A structural formula is worth a thousand IUPAC words.
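To make the "raw weighted graph" idea a bit more concrete, here is a minimal Python sketch of pyrrole as such a graph, with hydrogens left out. It assumes the usual ring numbering with nitrogen at position 1 and one formal Kekulé structure; the variable names are purely illustrative and nothing here is an actual naming algorithm. It only reads off the locants that the replacement-style name above is built from.

```python
# Heavy-atom graph of pyrrole (hydrogens omitted), ring positions 1..5.
# Assumed numbering: nitrogen is position 1.
elements = {1: "N", 2: "C", 3: "C", 4: "C", 5: "C"}

# Edge weights are formal bond orders of one Kekule structure.
bond_orders = {(1, 2): 1, (2, 3): 2, (3, 4): 1, (4, 5): 2, (1, 5): 1}

# Positions where a carbon of the parent ring is replaced ("aza").
hetero_locants = sorted(i for i, el in elements.items() if el != "C")

# Each double bond is cited by the lower-numbered atom it starts from.
double_locants = sorted(
    min(a, b) for (a, b), order in bond_orders.items() if order == 2
)

print("heteroatom positions:", hetero_locants)
print("double-bond locants:", double_locants)
```

Under these assumptions the graph gives nitrogen at position 1 and double bonds at 2 and 4, consistent with 1-azacyclopenta-2,4-diene. For a five-membered ring this is trivial; for something like maitotoxin, working directly at this level is exactly the impractical part.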
