# Chemistry - How to identify hydrogen bonds and other non-covalent interactions from structure considerations?

## Solution 1:

It is safe to say that there will always be intermolecular forces at play. By the time you consider them, you should already have a good idea of the molecules involved in your system.
Based on the composition and molecular structures you can make certain assumptions. Within a molecule it is straightforward to estimate (bond) polarities from electronegativities, and then to infer from these how the molecules might arrange. I will work an example based on the interactions between adenine and thymine later.[1]

With the advent of the information age, tools that every chemist has at her or his disposal have become more sophisticated. We have access to many digital resources like databases,[2] or publication servers[3] to retrieve a vast amount of information. Molecular modelling,[4] or even more sophisticated quantum chemical calculations,[5] have become more important; and free tools are available for everyone to use.

With that being said, here are some points that might help you identify hydrogen bonds and other non-covalent interactions. For that purpose, let's have a look at our example molecules, adenine and thymine.

Immediately we can formulate a couple of assumptions based on the schematic representation. In adenine there are five nitrogen atoms, which have a higher electronegativity than carbon. Negative partial charges will therefore be located mostly on these. Hydrogens in organic compounds usually carry positive partial charges, although $$\ce{C-H}$$ bonds tend to be a lot less polarised than $$\ce{N-H}$$ bonds. Whenever a hydrogen is involved in a bond, that bond can potentially act as a hydrogen-bond donor (see below).
Similar observations can be made for thymine. Here we have two terminal oxygen atoms, which will carry a negative partial charge, since oxygen has one of the highest electronegativities. These are often able to accept hydrogen bonds. On the other hand, we also have $$\ce{N-H}$$ bonds, which can act as hydrogen-bond donors.
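The polarity argument can be made semi-quantitative with a few lines of Python. This is only a rough sketch: it assumes the Pauling electronegativity scale (the text above does not commit to a particular scale), and the function name is illustrative.

```python
# Pauling electronegativities (assumption: Pauling scale).
PAULING = {"H": 2.20, "C": 2.55, "N": 3.04, "O": 3.44}

def polarity(heavy_atom, partner="H"):
    """Electronegativity difference chi(heavy_atom) - chi(partner).

    A larger positive value means a more strongly polarised bond with
    the partial negative charge on the heavy atom.
    """
    return PAULING[heavy_atom] - PAULING[partner]

for element in ("C", "N", "O"):
    print(f"{element}-H: delta chi = {polarity(element):.2f}")
```

The differences increase in the order C-H, N-H, O-H, which is the ordering used above to argue that N-H bonds are better hydrogen-bond donors than C-H bonds.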

## Charges & Electrostatic Potential Surfaces

For many molecules, structures are readily available. If not, some molecular editors let you optimise built (guessed) structures with their implemented force fields. Based on those you can already do a few analyses. One tool that is quite powerful for various tasks is Avogadro; it lets you read crystal structures, perform basic calculations and much more. If you are just playing around, this is a really good choice.
For example, I have imported the crystal structure of adenine into Avogadro, optimised it, and calculated the electrostatic potential. Alternatively, after extracting Cartesian coordinates, Molden lets you easily calculate the charges.[6]

## Hydrogen Bonds

Many molecular editors try to guess hydrogen bonds based on their implemented cutoff values. That is certainly very helpful, but not everything can be automatised in this way, and weaker interactions in particular won't be found. One then has to dig a bit deeper.
As a nice and concise example I have picked an intermolecular 2:1 complex between adenine and thymine, for which the crystal structure is available.[1]

There are two principal structural parameters to decide about hydrogen bonds: (a) The distance between the hydrogen $$\ce{H}$$ and the hydrogen-bond acceptor $$\ce{Y}$$ is significantly shorter than the sum of their respective van-der-Waals radii, $$d(\ce{XH\bond{...}Y}) < \sum r_\mathrm{vdW}(\ce{Y},\ce{H})$$.[7] (b) The angle around the hydrogen is nearly linear, $$\angle(\ce{XH\bond{...}Y}) \approx 180^\circ$$. For weakly polarised $$\ce{XH}$$ bonds, isotropic dispersion forces become more important (while the directional electrostatic and covalent contributions decrease), therefore the angle becomes more flexible.

We easily see that the bond angles are close to what we expect for hydrogen bonds. I have reproduced a few values from Batsanov's paper below, with the caveat that the value for hydrogen varies strongly with the chemical environment, $$\pu{110 - 161 pm}$$, so I used the classic value from Bondi.[7c] Since all the distances are around $$\pu{200 pm}$$, they are well below the threshold we set earlier.

$$\begin{array}{lr} \text{Element }\ce{Y} & r_\mathrm{vdW}(\ce{Y})/\pu{pm}\\\hline \ce{H} & \approx 120\\ \ce{C} & 196 \\ \ce{N} & 179 \\ \ce{O} & 171 \\\hline \end{array}\hspace{2ex} \begin{array}{lr}\\ \text{Contact }\ce{H\bond{...}Y} & \sum r_\mathrm{vdW}(\ce{Y},\ce{H})/\pu{pm}\\\hline \ce{H\bond{...}C} & 316 \\ \ce{H\bond{...}N} & 299 \\ \ce{H\bond{...}O} & 291 \\\hline \end{array}$$
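Both criteria can be checked directly from Cartesian coordinates. The following is a minimal sketch in Python, assuming the coordinates of the optimised complex from the appendix (in ångström, labelled by element and position in the XYZ list) and the van-der-Waals sums from the table above. The function names are purely illustrative.

```python
import math

# Coordinates in angstrom, copied from the optimised structure in the
# appendix; labels are element symbol + 1-based index in the XYZ list.
ATOMS = {
    "O1":  (19.03780, 11.79565, 1.63996),
    "O2":  (14.74303, 13.36808, 1.71115),
    "N37": (20.64990, 14.21957, 1.70686),
    "H38": (19.98577, 13.44513, 1.68592),
    "C41": (16.77892, 16.17095, 1.78070),
    "H42": (15.71578, 15.97320, 1.77996),
}

# r_vdW(H) + r_vdW(Y) in pm for the acceptors Y, from the table above.
VDW_SUM_PM = {"N": 299.0, "O": 291.0}

def distance_pm(a, b):
    """Distance between two labelled atoms in pm (1 angstrom = 100 pm)."""
    return 100.0 * math.dist(ATOMS[a], ATOMS[b])

def angle_deg(x, h, y):
    """Angle X-H...Y in degrees, measured at the hydrogen."""
    hx = [p - q for p, q in zip(ATOMS[x], ATOMS[h])]
    hy = [p - q for p, q in zip(ATOMS[y], ATOMS[h])]
    dot = sum(p * q for p, q in zip(hx, hy))
    cos = dot / (math.hypot(*hx) * math.hypot(*hy))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos))))

def check_contact(x, h, y):
    """Return (H...Y distance, X-H...Y angle, distance below vdW sum?)."""
    d = distance_pm(h, y)
    return d, angle_deg(x, h, y), d < VDW_SUM_PM[y[0]]

for donor, hydrogen, acceptor in (("N37", "H38", "O1"), ("C41", "H42", "O2")):
    d, ang, short = check_contact(donor, hydrogen, acceptor)
    print(f"{donor}-{hydrogen}...{acceptor}: {d:.0f} pm, {ang:.0f} deg, "
          f"below vdW sum: {short}")
```

In this optimised geometry the N-H···O contact comes out short and close to linear, whereas the C-H···O contact is considerably longer and clearly bent, in line with the weaker and less directional character of weakly polarised donors described above.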

A quite interesting approach for revealing non-covalent interactions was presented by Johnson et al.; the corresponding program is easy to use and only requires Cartesian coordinates.[8]

The surfaces between the molecules represent these interactions, where green represents weak interactions, typically found for dispersion. Blue represents stronger interactions, typically found for hydrogen bonds. Red displays repulsive forces, typically found within ring or cage systems.

If you have access to quantum chemical software, then you can also generate this plot from wave function files (.wfn).
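For context, the surfaces in the approach of Johnson et al.[8a] are isosurfaces of the reduced density gradient $$s = |\nabla\rho| / \left(2(3\pi^2)^{1/3}\rho^{4/3}\right)$$, which becomes small wherever the density is nearly flat, for example between two interacting molecules. The sketch below only evaluates this formula; it is not the NCIPLOT implementation, and the hydrogen 1s density is used purely as a toy input in atomic units.

```python
import math

def reduced_density_gradient(rho, grad_norm):
    """s = |grad rho| / (2 (3 pi^2)^(1/3) rho^(4/3)), in atomic units."""
    prefactor = 2.0 * (3.0 * math.pi ** 2) ** (1.0 / 3.0)
    return grad_norm / (prefactor * rho ** (4.0 / 3.0))

def hydrogen_1s_density(r):
    """Toy density of a hydrogen atom, rho(r) = exp(-2r) / pi."""
    return math.exp(-2.0 * r) / math.pi

# For the 1s density the gradient norm is simply 2 * rho(r).
for r in (0.5, 1.0, 2.0, 4.0):
    rho = hydrogen_1s_density(r)
    s = reduced_density_gradient(rho, 2.0 * rho)
    print(f"r = {r:3.1f} bohr: rho = {rho:.5f}, s = {s:.3f}")
```

For an isolated atom, s simply grows as the density decays. Non-covalent contacts stand out because they combine low density with low s, which is what the plotted isosurfaces pick up.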

Another possibility is to analyse the electron density in terms of the quantum theory of atoms in molecules (QTAIM).[9] For this you do need a wave function file. The analysis, however, is straightforward and will either yield a bond path or not. If there is a bond path, we can estimate the strength of this bond with the methodology developed by Espinosa et al.[9d] According to this, the bond strength is approximately half the value of the potential energy density at the bond critical point. $$E_\mathrm{H-Bond} = \frac{1}{2}V(r_{\mathrm{BCP}[\ce{XH\bond{...}Y}]})$$
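Once the potential energy density at the bond critical point is known, the estimate is a simple unit conversion. The sketch below assumes that the analysis program reports $$V$$ in atomic units and that the result is wanted as a positive magnitude in kJ/mol. The input value is hypothetical and not taken from the calculation discussed here.

```python
HARTREE_TO_KJ_PER_MOL = 2625.4996  # 1 hartree expressed in kJ/mol

def espinosa_strength_kj_mol(v_bcp_au):
    """Estimate a hydrogen-bond strength from V(r_BCP).

    v_bcp_au: potential energy density at the bond critical point in
    atomic units (assumption: negative for an attractive contact).
    Returns half its magnitude, converted to kJ/mol.
    """
    return 0.5 * abs(v_bcp_au) * HARTREE_TO_KJ_PER_MOL

# Hypothetical value for illustration only.
v_example = -0.040
print(f"E(H-bond) = {espinosa_strength_kj_mol(v_example):.1f} kJ/mol")
```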

I have performed such a calculation at the DF-B97D3/def2-TZVPP level of theory with Gaussian 09. The optimised geometry is given in the appendix.

$$\begin{array}{lr} \text{H-Bond} & E_\mathrm{H-Bond}/\pu{kJ mol-1}\\\hline \mathrm{N(37)H \cdots O(1)} & 46.6\\ \mathrm{N(5)H \cdots N(36)} & 38.5\\ \mathrm{C(41)H \cdots O(2)} & 3.2\\ \mathrm{N(20)H \cdots O(2)} & 50.5\\ \mathrm{N(3)H \cdots N(24)} & 29.9\\ \end{array}$$

A general warning applies to the above: the absolute values are only approximate, but they fall within the expected range. A very nice side effect of this methodology is that it can be applied to intramolecular hydrogen bonds, too.

### Concluding remarks

Dispersive interactions and hydrogen bonds become more and more important in rational reaction design, be it for understanding the molecular structure of biomolecules or as a guiding principle for catalyst-substrate interactions. With further development of computer technology, these analyses should become more accessible to everyone. I hope this post demonstrates that gaining more insight can actually be quite easy (and free).

### Notes and References

1. (a) Based on the structure from S. Chandrasekhar, T. R. Naik, S. K. Nayak, T. N. Row, Bioorg. Med. Chem. Lett. 2010, 20 (12), 3530-3533. DOI: 10.1016/j.bmcl.2010.04.131 PMID: 20493694 CSD: 739016 (b) Adenine, CSID: 185 (c) S. Mahapatra, S. K. Nayak, S. J. Prathapa, T. N. Guru Row, Cryst. Growth Des. 2008, 8 (4), 1223–1225. DOI: 10.1021/cg700743w, CSD: 652573 (d) Thymine, CSID: 1103 (e) G. Portalone, L. Bencivenni, M. Colapietro, A. Pieretti, F. Ramondo, Acta Chemica Scand. 1999, 53, 57-68. DOI: 10.3891/acta.chem.scand.53-0057, CSD: 136916

2. (a) The Cambridge Structural Database (CSD), https://www.ccdc.cam.ac.uk/ (b) Crystallography Open Database (COD), http://www.crystallography.net/cod/ (c) Computational Chemistry Comparison and Benchmark DataBase, http://cccbdb.nist.gov/ (Only for 1799 small molecules and atoms) (d) Handbook of Chemistry and Physics, http://hbcponline.com/faces/contents/ContentsSearch.xhtml (e) ...

3. (a) SciFinder, https://www.cas.org/products/scifinder (b) Google Scholar, https://scholar.google.de/ (c) Web of Science, (formerly known as Web of Knowledge) http://www.webofknowledge.com/ (d) ...

4. (a) MolCalc, http://molcalc.org/ (b) Pitt Quantum Repository, https://pqr.pitt.edu/ (At the time of writing it was dead.) Github: pittquantum (c) Many open source molecular editors include the possibility to use force field calculations. For example: Avogadro, molden (d) For more on molecular modelling in the open source domain see S. Pirhadi, J. Sunseri, D. R. Koes, J. Mol. Graph. Model. 2016, 69, 127-143. An updated online version of this catalog can be found at https://opensourcemolecularmodeling.github.io.

5. (a) For an extensive, but not necessarily complete, list of quantum chemistry software see Wikipedia. (b) For the purpose of this demonstration I will be using the proprietary software Gaussian. (c) To view crystal structures, Mercury can be obtained (for free) from the Cambridge Crystallographic Data Centre (CCDC), which also hosts CSD. https://www.ccdc.cam.ac.uk/solutions/csd-system/components/mercury/

6. (a) Tutorial for Avogadro (b) Tutorial for molden

7. (a) A concise (and as far as I can tell newest) list of van-der-Waals radii of many elements can be found in S. S. Batsanov, Inorg. Mat. 2001, 37, 871-885. DOI: 10.1023/A:1011625728803 (b) A list of van-der-Waals radii can also be found on Wikipedia (c) A. Bondi, J. Phys. Chem. 1964, 68, 441-451. DOI: 10.1021/j100785a001

8. (a) The original publication: E. R. Johnson, S. Keinan, P. Mori-Sánchez, J. Contreras-García, A. J. Cohen, W. Yang, J. Am. Chem. Soc. 2010, 132, 6498-6506. DOI: 10.1021/ja100936w (b) The presentation of the program: J. Contreras-García, E. R. Johnson, S. Keinan, R. Chaudret, J-P. Piquemal, D. N. Beratan, W. Yang, J. Chem. Theory Comput. 2011, 7, 625-632. DOI: 10.1021/ct100641a (c) Download the code: http://www.lct.jussieu.fr/pagesperso/contrera/nciplot.html (d) You'll also need VMD (Visual Molecular Dynamics) from the University of Illinois

9. (a) A very brief introduction can be found on Wikipedia. The corresponding book: R. Bader, Atoms in Molecules: A Quantum Theory, Oxford University Press, USA, 1994. ISBN 978-0-19-855865-1. (b) Multiwfn - A Multifunctional Wavefunction Analyzer; http://sobereva.com/multiwfn/ corresponding paper: T. Lu, F. Chen, J. Comput. Chem. 2012, 33, 580-592. DOI: 10.1002/jcc.22885 (c) Startup script (and examples) for Linux version: https://github.com/polyluxus/runMultiwfn.bash (shameless self-plug) (d) E. Espinosa, E. Molins, C. Lecomte, Chem. Phys. Lett. 1998, 285, 170–173.

### Appendix

Optimised Structure of the adenine-thymine 2:1 complex calculated at DF-B97D3/def2-TZVPP in Gaussian 09 Rev. E.01

  45
E(RB97D3/def2TZVPP/W06) = -1388.51095169
O         19.03780       11.79565        1.63996
O         14.74303       13.36808        1.71115
N         15.07940       11.09762        1.64043
H         14.04669       10.93403        1.64128
N         16.88238       12.55083        1.67499
H         17.23771       13.53695        1.70257
C         17.83067       11.52964        1.63864
C         17.29332       10.17534        1.60048
C         15.51719       12.40225        1.67775
C         15.94232       10.03610        1.60357
H         15.46489        9.06076        1.57654
C         18.24710        9.02139        1.56030
H         18.89656        9.08263        0.68030
H         18.90669        9.02997        2.43483
H         17.71085        8.06866        1.53480
N          8.28060       10.09397        1.65692
H          7.68055        9.28386        1.63653
N          8.91986       12.24943        1.71837
N         10.48101        9.01368        1.60709
N         11.91467       12.94472        1.71757
H         12.92612       13.11883        1.71578
H         11.26613       13.71385        1.74609
C         10.03193       11.42408        1.68464
N         12.26612       10.63929        1.64378
C         11.41986       11.70069        1.68284
C          9.65939       10.07380        1.64587
C          7.89700       11.42267        1.70074
H          6.85459       11.71169        1.71748
C         11.75955        9.39245        1.60917
H         12.49648        8.59107        1.57896
N         21.22700       16.46826        1.76840
N         17.31372       17.43807        1.81488
H         16.80939       18.31038        1.84233
N         19.61652       18.27309        1.82815
C         18.69253       17.30751        1.80467
N         17.71219       15.24141        1.74956
N         20.64990       14.21957        1.70686
H         19.98577       13.44513        1.68592
H         21.63800       14.02512        1.69520
C         20.27365       15.50797        1.74532
C         16.77892       16.17095        1.78070
H         15.71578       15.97320        1.77996
C         18.91965       15.92447        1.76371
C         20.85479       17.75435        1.80728
H         21.67003       18.47621        1.82413


## Solution 2:

Besides the computational methods clearly described by @Martin, you can identify hydrogen bonds using a number of experimental methods. Many are indirect, such as melting and boiling points; more specific are changes in absorption/fluorescence and changes in IR and Raman spectra. Microwave spectra can be used in the gas phase.

The best methods are inelastic neutron scattering, which provides a way of measuring the potential energy surface of protons in hydrogen bonds, and X-ray and neutron diffraction. Solid-state NMR can also be used.

The best of these is neutron diffraction, because the H atoms can be located with similar accuracy to, for instance, C and N atoms. But these are difficult experiments and require the use of a national facility. X-ray diffraction is routinely available in many labs, but the H atoms hardly scatter X-rays, as they have so few electrons compared to C or N, and so the positions of the H atoms usually have to be inferred from the known positions of other atoms. This is not as arbitrary as it might seem, as high-quality X-ray structures are routinely produced nowadays.