# Applying Trotter gate to MPS or MPO


(Edited Feb 11, 2019)

Hello. I know that there is already a formula page about how to apply Trotter gates to an MPS. However, there are some technical details that I do not understand completely, and I also want to learn about its generalization to the MPO case.

1. Before applying the 2-site gates, the instructions say that the gauge position should be shifted to one of the 2 sites on which the gate acts. I suspect that this is to make the truncation error as small as possible when doing the SVD, but I am not sure. By the way, are there any textbooks or articles on this issue?

2. Finally, is there any previous work on applying multi-site (such as 3- or 4-site) gates to an MPS or MPO? For example, I may want to simulate the time evolution of the 2D Ising model with nearest-neighbor interactions (this will lead to 4-site Trotter gates). I am aware that I may relabel the sites (connecting the sites using a string) to convert it to a 1D system and make use of swap gates.
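The relabelling mentioned above can be sketched in a few lines. This is only an illustration: the function names `snake_index` and `bond_ranges` are hypothetical, and the boustrophedon ("snake") path is one assumed choice of string among many. It shows how far apart nearest-neighbor sites of the 2D lattice end up on the 1D chain, which sets how many swap gates are needed to bring a pair together.

```python
def snake_index(x, y, lx):
    """Chain position of lattice site (x, y) on a snake path through rows of width lx."""
    # Even rows run left to right, odd rows right to left.
    return y * lx + (x if y % 2 == 0 else lx - 1 - x)


def bond_ranges(lx, ly):
    """Chain distance |i - j| for every nearest-neighbor bond of an lx-by-ly lattice."""
    ranges = []
    for y in range(ly):
        for x in range(lx):
            i = snake_index(x, y, lx)
            if x + 1 < lx:  # horizontal bond
                ranges.append(abs(snake_index(x + 1, y, lx) - i))
            if y + 1 < ly:  # vertical bond
                ranges.append(abs(snake_index(x, y + 1, lx) - i))
    return ranges


ranges = bond_ranges(lx=4, ly=3)
max_range = max(ranges)            # longest bond measured along the chain
swaps_for_longest = max_range - 1  # swaps needed to make that pair adjacent (one way)
print(max_range, swaps_for_longest)
```

With this path the horizontal bonds stay nearest-neighbor on the chain, while vertical bonds reach a range of up to `2 * lx - 1`, so the swap overhead grows with the width of the lattice.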


Hi Steve,

1) That is correct: putting the gauge position where the gate is being applied is meant to help minimize the truncation error. Basically, using that gauge center allows one to treat the local tensors (the contraction of the MPS tensors with the gate you are applying) as the square root of the density matrix on those sites, and therefore the best low-rank approximation is obtained by truncating according to the singular values. An early reference on this is:

https://arxiv.org/abs/cond-mat/0605597

Note that this is essentially the same issue that comes up in 2-site DMRG, that it is best to perform DMRG when the gauge center of the wavefunction is located on the sites you are optimizing, which makes the eigenvector equation simpler and the truncation of the bond dimension using the singular value decomposition optimal.
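As a minimal NumPy sketch of the point above (not ITensor code): when the tensors to the left and right of the two gated sites are isometries, the weight discarded by the local SVD equals the error made in the full state. The helper `apply_gate_and_split`, the index ordering, and the random orthogonal matrix standing in for a Trotter gate are all assumptions made for illustration.

```python
import numpy as np


def apply_gate_and_split(theta, gate, chi_max):
    """Apply a two-site gate to the center tensor theta and split it with a truncated SVD.

    theta: (Dl, d, d, Dr) two-site tensor carrying the gauge (orthogonality) center.
    gate:  (d, d, d, d) with index order (out1, out2, in1, in2).
    """
    dl, d, _, dr = theta.shape
    theta = np.einsum("abij,lijr->labr", gate, theta)
    u, s, vh = np.linalg.svd(theta.reshape(dl * d, d * dr), full_matrices=False)
    chi = min(chi_max, len(s))
    discarded = float(np.sum(s[chi:] ** 2))  # discarded weight of the local SVD
    a = u[:, :chi].reshape(dl, d, chi)       # left-orthonormal site tensor
    b = (s[:chi, None] * vh[:chi]).reshape(chi, d, dr)  # new gauge center
    return a, b, discarded


rng = np.random.default_rng(0)
d = 2
# Orthonormal environments: one isometric site tensor on each side of the gated pair.
left_env = np.linalg.qr(rng.normal(size=(d, d)))[0].reshape(1, d, d)
right_env = np.linalg.qr(rng.normal(size=(d, d)))[0].reshape(d, d, 1)
theta = rng.normal(size=(d, d, d, d))
theta /= np.linalg.norm(theta)
# Random orthogonal matrix used as a stand-in for a Trotter gate.
gate = np.linalg.qr(rng.normal(size=(d * d, d * d)))[0].reshape(d, d, d, d)

a, b, discarded = apply_gate_and_split(theta, gate, chi_max=2)


def full_state(two_site):
    """Contract the environments with a two-site tensor into the 4-site state vector."""
    return np.einsum("asl,lijr,rtb->asijtb", left_env, two_site, right_env).ravel()


exact = full_state(np.einsum("abij,lijr->labr", gate, theta))
approx = full_state(np.einsum("lac,cbr->labr", a, b))
global_error = float(np.sum((exact - approx) ** 2))
print(discarded, global_error)  # the two numbers agree up to rounding
```

If the environments were not isometries (that is, if the gauge center sat somewhere else), the local singular values would no longer be the Schmidt coefficients of the full state and the two numbers would generally differ.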

2) Time evolution of 2D systems is definitely a tricky subject. There are a few methods available for doing it. As you mention, one approach is to make use of swap gates to move the Trotter gates so they become local for the MPS. Another approach is to turn the non-local gates into an MPO. There are also other approaches available for performing time evolution with non-local Hamiltonians, like the time-dependent variational principle (TDVP). Here are some references with discussions about these approaches (these are just some that come to mind, but it is a large literature at this point):

Note that time evolving MPOs can be even more subtle. There is a problem that naively truncating the MPO with an SVD, as one would do with an MPS, can cause the MPO to possibly become non-positive. Positivity is a property one wants to preserve if the MPO represents a density matrix, and introducing negative eigenvalues can lead to numerical instability in the evolution. Here is a paper discussing that issue, which is addressed with a new technique called density matrix truncation (DMT):

https://arxiv.org/abs/1707.01506
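The loss of positivity can be seen in a two-site toy example. The sketch below is only an illustration and does not implement DMT: it takes the pure-state density matrix of an assumed state sqrt(0.8)|00> + sqrt(0.2)|11>, performs an SVD across the cut between the two sites in operator space (as one would for an MPO bond), and keeps 3 of the 4 singular values.

```python
import numpy as np

# Assumed toy state: sqrt(0.8)|00> + sqrt(0.2)|11>
psi = np.zeros((2, 2))
psi[0, 0] = np.sqrt(0.8)
psi[1, 1] = np.sqrt(0.2)

# rho[s1, s2, s1', s2'] = psi[s1, s2] * psi[s1', s2'] (real state, no conjugation needed)
rho = np.einsum("ab,cd->abcd", psi, psi)

# Group (s1, s1') as rows and (s2, s2') as columns: the matrix an MPO bond SVD acts on.
m = rho.transpose(0, 2, 1, 3).reshape(4, 4)
u, s, vh = np.linalg.svd(m)

keep = 3  # naive truncation: drop the smallest singular value
m_trunc = (u[:, :keep] * s[:keep]) @ vh[:keep]

# Back to an ordinary 4x4 matrix with rows (s1, s2) and columns (s1', s2').
rho_trunc = m_trunc.reshape(2, 2, 2, 2).transpose(0, 2, 1, 3).reshape(4, 4)

min_eig_before = np.linalg.eigvalsh(rho.reshape(4, 4)).min()
min_eig_after = np.linalg.eigvalsh(rho_trunc).min()
print(s)              # singular values 0.8, 0.4, 0.4, 0.2
print(min_eig_after)  # about -0.166: the truncated operator is no longer positive
```

The truncation is the best rank-3 approximation in the Frobenius norm, yet it removes the |11><11| diagonal term while keeping the coherences, which is what produces the negative eigenvalue.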

If you have questions about implementing these methods in ITensor, please don't hesitate to ask. It is difficult to give more guidance without more specifics about the calculations you want to perform.

Cheers,
Matt

commented by (260 points)
edited
Thank you for sharing the references! In fact, I have always been worried that using a chain to relabel the spins on a 2D lattice may destroy some properties of the system. For example, there may be some entanglement between two adjacent rows, which (in my opinion) is hard to express with a 1D MPS. Is there any known example of a 2D system that MUST be treated with PEPS and cannot be converted to an MPS?

About DMT, I think it may not suit my present need: I am trying to time-evolve the string operator of the 2D toric code on a square lattice while a magnetic field is slowly turned on. (Maybe I am giving too much detail here.) As you know, the string operator is a product of Sx operators. I think that it can be represented by tensors of virtual dimension (left 1, right 1) and physical dimension (up 2, down 2). I put the Sx matrix on the sites of the string and the identity matrix elsewhere, and simply connect them by 1-dimensional virtual links (currently implemented in Python). Then the tensor elements at site j will explode as the system size increases if I set the gauge center there, making it impossible to use the DMT algorithm.
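A minimal sketch of the bond-dimension-1 construction described above, assuming the index order (left, up, down, right), the spin-1/2 convention Sx = sigma_x / 2, and hypothetical helper names. It offers one possible reading of the growth: the Frobenius norm of a product operator is the product of the local norms, and each identity contributes a factor sqrt(2), so a gauge center that carries the whole norm grows exponentially with the number of sites.

```python
import numpy as np

SX = np.array([[0.0, 0.5], [0.5, 0.0]])  # spin-1/2 Sx (assumed convention)
ID = np.eye(2)


def string_operator_mpo(n_sites, string_sites):
    """Bond-dimension-1 MPO: Sx on the string, identity elsewhere.

    Each tensor has shape (left=1, up=2, down=2, right=1).
    """
    return [
        (SX if j in string_sites else ID).reshape(1, 2, 2, 1)
        for j in range(n_sites)
    ]


def frobenius_norm(mpo):
    """Frobenius norm of a product operator: the product of the local tensor norms."""
    return float(np.prod([np.linalg.norm(w) for w in mpo]))


# Same 4-site string, increasing system size.
norms = {n: frobenius_norm(string_operator_mpo(n, {2, 3, 4, 5})) for n in (10, 20, 40)}
print(norms)
```

Under this reading, the growth comes from the normalization of the operator rather than from its structure; dividing each identity tensor by sqrt(2) and tracking the overall scalar separately would keep the numbers bounded, though whether that fits a given algorithm is not addressed here.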

You also mentioned that using the gauge center allows one to treat the local tensors as the square root of the density matrix on those sites. So for 4-site gates (the star/vertex operator and the plaquette operator in the toric code), it seems that I should break it apart as 4 to 2+2 to 1+1+1+1, instead of 4 to 1+3 to 1+1+2 to 1+1+1+1?
commented by (14.1k points)
It is absolutely true that an MPS does not efficiently represent the entanglement structure of a 2D system. To represent a 2D state with an area law using an MPS, the bond dimension of the MPS will scale exponentially with the width of the system (if we are thinking about the 2D system as a finite strip or cylinder). However, MPS are often used instead of PEPS because MPS methods are very reliable and well developed, and can get good results for small 2D systems (cylinders with widths of 10-20). So to answer your first question, every 2D system of large enough size must be represented in some other way besides an MPS. Here is a paper discussing a comparison of the scaling of MPS vs PEPS in more detail:

https://arxiv.org/abs/1705.03222

For your use case, where you are performing time evolution, time evolution with PEPS is not a very developed area (the only work I know of is this very recent one: https://arxiv.org/abs/1811.05497 ), so the most straightforward way is probably to try out MPS methods first.

You say that you would like to time evolve the string operator. May I ask why you are trying to do this? Will you eventually apply the (time evolved) string operator to the state? If so, it sounds like maybe you could time evolve the state, apply the original string operator, and then time evolve the resulting state after the string operator is applied (i.e. use the Schrodinger picture of evolution instead of the Heisenberg picture). This should be much easier to accomplish in terms of MPS.
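A small dense-matrix check of the suggestion above: applying the time-evolved operator to a state gives the same vector as evolving the state, applying the original operator, and evolving back. The random Hermitian matrix is only a stand-in Hamiltonian (not the toric code), and the three-site operator is a hypothetical short string.

```python
import numpy as np

rng = np.random.default_rng(1)
dim = 8  # three spin-1/2 sites, small enough for dense matrices

# Random Hermitian stand-in for the Hamiltonian.
m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
h = (m + m.conj().T) / 2
w, v = np.linalg.eigh(h)
t = 0.3
u = (v * np.exp(-1j * t * w)) @ v.conj().T  # U = exp(-i H t)

sx = np.array([[0.0, 0.5], [0.5, 0.0]])
op = np.kron(np.kron(sx, sx), np.eye(2))  # short "string": Sx Sx 1

psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi /= np.linalg.norm(psi)

# Heisenberg picture: build the evolved operator, then apply it.
heisenberg = (u.conj().T @ op @ u) @ psi
# Schrodinger picture: only states are evolved; the operator stays a simple product.
schrodinger = u.conj().T @ (op @ (u @ psi))

print(np.allclose(heisenberg, schrodinger))
```

In an MPS setting the second route means the evolution acts only on states, while the string operator remains a bond-dimension-1 product that is cheap to apply.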

For your last question, I have to admit that I don't quite understand your notation. What do you mean by "break it apart by 4 to 2+2 to 1+1+1+1"? Anyway, to figure out the details of how to do time evolution with four-site gates of the toric code, it would depend on what evolution method you were interested in using (swap gates, long range MPOs, TDVP, etc.). I believe that in ITensor, using long range MPOs would be the easiest thing to try out first, so if you have any questions about that please let us know.

-Matt
commented by (260 points)
edited
Thanks for the quick reply! When I tried the MPO evolution, I discovered that ITensor does not support the exponentiation of the toric code Hamiltonian, which includes 4-site operators. I also tried to manually construct the @@W_{I}@@ used in PRB 91, 165112, but it keeps only "non-overlapping terms": this means that many of the vertices and plaquettes (which overlap significantly with each other after relabelling the sites to a 1D chain) have to be discarded when constructing the evolution MPO. So I think the MPO method may be unreliable. By the way, does the toExpH function in ITensor use this @@W_{I}@@ (as suggested by the name toExpH_ZW1 in the source code), or the more complex @@W_{II}@@?

So I have to use the Trotter + swap gate method. It seems that ITensor only provides functions dealing with 2-site gates, so I am implementing the 4-site gates in Python. However, according to arXiv:1705.05578 this method is "restricted to 2-body terms", so the 4-site gate is for now only a naive trial.

(It seems to me that Trotter + swap gates is synonymous with TEBD, and that the evolution gate is obtained from the truncated series exp(x) = 1 + x + x^2/2 + ... ≈ ((x/3 + 1) * x/2 + 1) * x + 1. Am I understanding this correctly?)
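The nested expression above is the third-order Taylor polynomial of exp(x) written in Horner form. A minimal NumPy sketch, assuming a two-site Heisenberg bond term and a time step of 0.1 purely for illustration, compares it with the exact exponential; the function names are hypothetical and this is not a statement about how any particular library builds its gates.

```python
import numpy as np


def exp_taylor_horner(x, order):
    """Taylor polynomial of exp(x) up to x**order, evaluated in nested (Horner) form."""
    identity = np.eye(x.shape[0], dtype=x.dtype)
    result = identity.copy()
    for k in range(order, 0, -1):
        result = identity + (x @ result) / k
    return result


def exp_hermitian(h, tau):
    """Reference value of exp(-i * tau * h) for Hermitian h, via eigendecomposition."""
    w, v = np.linalg.eigh(h)
    return (v * np.exp(-1j * tau * w)) @ v.conj().T


sx = np.array([[0, 0.5], [0.5, 0]], dtype=complex)
sy = np.array([[0, -0.5j], [0.5j, 0]], dtype=complex)
sz = np.array([[0.5, 0], [0, -0.5]], dtype=complex)
h_bond = np.kron(sx, sx) + np.kron(sy, sy) + np.kron(sz, sz)  # assumed example term

tau = 0.1
x = -1j * tau * h_bond
identity = np.eye(4, dtype=complex)

# The expression quoted above, written out literally.
literal = ((x / 3 + identity) @ x / 2 + identity) @ x + identity

exact = exp_hermitian(h_bond, tau)
err3 = np.linalg.norm(exp_taylor_horner(x, 3) - exact)
err10 = np.linalg.norm(exp_taylor_horner(x, 10) - exact)
print(err3, err10)
```

For a small time step the third-order error is already tiny, and raising the order pushes it to machine precision; for a dense two-site or four-site gate, an eigendecomposition as in `exp_hermitian` is an equally cheap alternative.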

About the 4-site SVD: after applying the 4-site gate, I obtain a large tensor that I need to split back into 4 local tensors. You said "using that gauge center allows one to treat the local tensors ... as the square root of the density matrix on those sites", so in the 2-site case, splitting the large tensor with an SVD is optimal. But this may not be true in the 4-site case; I don't know whether I should proceed as:

Large 4-site tensor -> two 2-site tensors -> four 1-site tensors
Or
Large 4-site tensor -> one 1-site tensor and one 3-site tensor -> two 1-site tensors and one 2-site tensor -> four 1-site tensors
Or neither of them is really correct because of the gauging problem.
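Both orderings can be tried numerically. The sketch below is exploratory and does not settle which is preferable: it splits a random 4-site tensor along each route, checks that both reproduce the tensor exactly when nothing is truncated, and reports the errors for a small bond dimension. All function names are hypothetical, and the environments outside the four sites are assumed to be orthonormal.

```python
import numpy as np


def truncated_svd(mat, chi_max):
    """SVD keeping at most chi_max singular values."""
    u, s, vh = np.linalg.svd(mat, full_matrices=False)
    chi = min(chi_max, len(s))
    return u[:, :chi], s[:chi], vh[:chi]


def split_sequential(theta, chi_max):
    """Route 4 -> 1+3 -> 1+1+2 -> 1+1+1+1, peeling off one site at a time."""
    phys, dr = theta.shape[1:-1], theta.shape[-1]
    left, rest, tensors = theta.shape[0], theta, []
    for d in phys[:-1]:
        u, s, vh = truncated_svd(rest.reshape(left * d, -1), chi_max)
        tensors.append(u.reshape(left, d, -1))
        rest = s[:, None] * vh  # singular values stay with the remainder
        left = len(s)
    tensors.append(rest.reshape(left, phys[-1], dr))
    return tensors


def split_two_plus_two(theta, chi_max):
    """Route 4 -> 2+2 -> 1+1+1+1, cutting the middle bond first."""
    dl, d1, d2, d3, d4, dr = theta.shape
    u, s, vh = truncated_svd(theta.reshape(dl * d1 * d2, d3 * d4 * dr), chi_max)
    chi = len(s)
    ua, sa, vha = truncated_svd(u.reshape(dl * d1, d2 * chi), chi_max)
    ub, sb, vhb = truncated_svd((s[:, None] * vh).reshape(chi * d3, d4 * dr), chi_max)
    return [
        ua.reshape(dl, d1, -1),
        (sa[:, None] * vha).reshape(-1, d2, chi),
        ub.reshape(chi, d3, -1),
        (sb[:, None] * vhb).reshape(-1, d4, dr),
    ]


def contract(tensors):
    """Contract a list of site tensors back into one multi-site tensor."""
    out = tensors[0]
    for t in tensors[1:]:
        out = np.tensordot(out, t, axes=([out.ndim - 1], [0]))
    return out


rng = np.random.default_rng(2)
theta = rng.normal(size=(2, 2, 2, 2, 2, 2))  # (Dl, d, d, d, d, Dr)
theta /= np.linalg.norm(theta)

exact_seq = contract(split_sequential(theta, chi_max=64))
exact_22 = contract(split_two_plus_two(theta, chi_max=64))
err_seq = np.linalg.norm(contract(split_sequential(theta, chi_max=3)) - theta)
err_22 = np.linalg.norm(contract(split_two_plus_two(theta, chi_max=3)) - theta)
print(err_seq, err_22)
```

In the sequential route the singular values travel with the remainder, so every SVD acts on the tensor that holds the gauge center and the site tensors to its left are isometries. In the 2+2 route as written, the left pair is split without the singular values of the middle bond, so a truncation there is not weighted by the state; the printed errors are worth comparing for the tensors that actually arise in a given calculation.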