(Upgraded to its own post from here.)
This was an attempted saddle point search in GAMESS trying to find if a transition state exists in the transfer of a proton from one water molecule to another (for formation of hydronium and hydroxide.)
This is the weirdest thing I’ve seen using GAMESS yet. This is not exactly a time simulation, it’s an attempted geometry minimization showing the computer trying different geometries in an attempt to locate a saddle point in the potential energy surface. I’m befuddled a bit by this search type because I’m trying to study a reaction pathway in my own work. Unfortunately, this sort of geometry search is operating in a counter-intuitive fashion and I’m not certain whether or not it’s broken in the program. However, when you see two oxygens fighting over a proton… well… that’s just cool. If the waters are set close enough so that they enjoy a hydrogen bond, the energy surface appears to have no extrema except where the protons are located as two waters. If you back the waters off from one another so that they are out of hydrogen bonding distance and pull the hydrogen out so that it is hydrogen bonding with both oxygens, you get this weird behavior where the proton bounces around until the nuclei are close enough to go back to the hydrogen bonded water configuration. I need to pin the oxygens away from each other, which won’t happen in reality.
Not sure what I think.
This post seems to be a good place to put weird and interesting things. Here is an ab initio computation for the structure of a Guanine Quartet.
The surface is again an equiprobability surface and is colored by electrical potential with green being most negative and red being most positive. This calculation took my meager computer 527 minutes (or 8 and a half hours). The central atom is a coordinating sodium and the purine rings are hydrogen bonded to each other by the dashed lines. The basis set was 3-21G, which is as big as I dare use while still being able to make the nitrogens lie flat when they’re in those purine rings. The structure definitely seems to approach C4h point symmetry, though I could not build the model with that symmetry to start out and simplify the calculation due to the complexity of the structure.
This aggregate has a very wickedly interesting HOMO (highest occupied molecular orbital). Because the system is not quite perfectly relaxed to planarity, the symmetry is a little tiny bit broken and the state is not quite but nearly 4-fold degenerate. If you plot all the (nearly) degenerate orbitals together, a very interesting structure emerges.
I thought this was really cool. You can see the peaks and troughs of the electron waves running around the plane. That’s eight electrons circulating in sort of a pi-bond-like structure. This is the highest energy occupied eigenstate of the complex.
I’ve learned some things about G-quartet recently that have had me going back and revisiting the calculations presented above. In the sets shown above, the geometry never quite reached convergence; the optimization step size was set too big and the geometry steps were oscillating around the minimum without ever reaching it for several hundred steps at a time and many hours –just a function of my inexperience at the time.
As it turns out, the structure of the G-quartet isn’t actually C4h in symmetry, as I note above. It might be C2, or even C1. When I was performing the calculation mentioned above, I was operating under the assumption that the G-quartet is a planar assembly, much like Watson-Crick base pairs. This is not an undue assumption and I’ve found published G-quadruplex NMR structure papers that seem to make similar assumptions –an NMR structure is built of computed correlations where certain features about how these correlations can be expressed are an assumption, like that G-quartets are flat. When I flattened it into a plane and attempted to optimize the geometry, it turned out that the structure has no stationary state that is flat. The calculation simply never converges. The G-quartet, liganded to a monovalent sodium ion or not, appears to prefer a saddle splay configuration. The thing has a shape like a Pringle.
This form of the G-quartet is produced using the Pople 6-31G** basis set, including a single D-function on atoms in the second row and a single P-function on hydrogens. The structure is not hugely different from the structures posted above, but the saddle splay it demonstrates is actually a geometry minimized feature of the structure. The hydrogen bonds holding this structure together are mainly unmarked because they turn out to be on the long side, ~2 Angstrom. (Certain G-quadruplex papers I found report these to be longer still, as much as 3 Angstrom)
Part of why I went back and repeated this calculation was because I discovered that the 3-21G basis set I was using above does not properly represent hydrogen bonding. 3-21G is on the small side and is apparently pretty good for a rough draft. For an object the size of a G-quartet, an appropriate, fully converged geometry optimization with just this small basis set took 239 steps and 451 minutes (7.5 hours). This even with my more recent discovery that direct SCF (calculating integrals as needed without ever storing them) is more efficient than classical SCF (calculating integrals once and then storing them) while parallel computing –recovering the integral from a large storage file is slower than simply computing it again. For big structures, you need to stay as small in the basis set as practical and the bigger you go, the smaller the basis which captures the features of the molecule, the better off you are time wise. 3-21G seemed a good choice until I realized that it was producing a non-planar structure for the G-quartet when I fully converged it.
At around this time, I had just discovered that 3-21G massively shortens hydrogen bonds… so much so, in fact, that you can’t really use it to simulate the chemistry of proton transfer in water. Noting that the G-quartet had this big saddle splay in 3-21G and that the quartet structure is dependent on hydrogen bonding, I had the <sarcasm>ingenious</sarcasm> notion that maybe the splay is a result of bond shortening and that a bigger basis set might capture the complex as flat.
So, I embarked on optimizing the structure with 6-31G and supplemented in a couple polarization functions (for 6-31G**) to try to cover the diffuse region where hydrogen bonding occurs. 6-31G doubles the function density of 3-21G, meaning that it increases the required number of integrals by about 16-fold. This is a ten-fold increase in computation time at minimum. I ended up running the thing overnight several nights in a row. The optimization took about 4 days with a bunch of restarts (I learned something cool about GAMESS restarting which I’ll keep to myself).
Remarkably enough, the saddle splay did not go away. In fact, 6-31G** increased the twisting! It got more splayed by a few degrees. This geometry appears to be not an artifact; I switched the coordinate system from the internal coordinates I was using during optimization to a cartesian external system, recreated the hessian and tried to repeat the optimization on the converged geometry and it refused to optimize any further. I was worried about the internal coordinates because GAMESS kept fretting that they were too interdependent, but switching systems didn’t obliterate the stationary point, suggesting it was common across possible coordinate systems.
On the other hand, the more detailed basis set only helps to give more accurate values. Qualitatively, it still very much resembles the previous work, even the poorly converged work above. Here is the cluster of HOMO orbitals, which are simply a retread of the image shown above but with the bond lengths and angles a little more accurate.
You can still see the electron wavelength as they circulate around the plane. You could actually calculate the approximate momentum from that: the de Broglie wavelength is visually obvious in there.
I’ve still got so much to learn. I’m looking at other more recent basis sets. It may well be that someone has invented an effective core potential basis set which captures the diffuse region better than 6-31G** and manages to thin up the number of functions lost in the core in order to cut the calculation time. 6-31G** is a bear on something like a G-quartet using only a laptop computer.
Here’s is an animated .gif showing the potato chip-like shape of a sodium liganded Guanine quartet.