Thursday, 29 January 2015

Leon Green's theorem

The fundamental objects of study in higher-order Fourier analysis are nilmanifolds, or in other words spaces given as a quotient $G/\Gamma$ of a connected nilpotent Lie group $G$ by a discrete cocompact subgroup $\Gamma$. Starting with Furstenberg's work on Szemeredi's theorem and the multiple recurrence theorem, work by Host and Kra, Green and Tao, and several others has gradually established that nilmanifolds control higher-order linear configurations in the same way that the circle, as in the Hardy-Littlewood circle method, controls first-order linear configurations.

Of basic importantance in the study of nilmanifolds is equidistribution: one needs to know when the sequence $g^n x$ equidistributes and when it is trapped inside a subnilmanifold. It turns out that this problem was already studied by Leon Green in the 60s. To describe the theorem note first that the abelianisation map $G\to G/G_2$ induces a map from $G/\Gamma$ to a torus $G/(G_2\Gamma)$ which respects the action of $G$, and recall that equidistribution on tori is well understood by Weyl's criterion. Leon Green's beautiful theorem then states that $g^n x$ equidistributes in the nilmanifold if and only if its image in the torus $G/(G_2\Gamma)$ equidistributes.

Today at our miniseminar Aled Walker showed us Parry's nice proof of this theorem, which is more elementary than Green's original proof. During the talk there was some discussion about the importantance of various hypotheses such as 'simply connected' and 'Lie'. It turns out that the proof works rather generally for connected locally compact nilpotent groups, so I thought I would record the proof here with minimal hypotheses. The meat of the argument is exactly as in Aled's talk and, presumably, Parry's paper.

Let $G$ be an arbitrary locally compact connected nilpotent group, say with lower central series$$G=G_1\geq G_2 \geq\cdots\geq G_s\geq G_{s+1}=1,$$and let $\Gamma\leq G$ be a closed cocompact subgroup. Under these conditions the Haar measure $\mu_G$ of $G$ induces a $G$-invariant probability measure $\mu_{G/\Gamma}$ on $G/\Gamma$. We say that $x_n\in G/\Gamma$ is equidistributed if for every $f\in C(G/\Gamma)$ we have$$\frac{1}{N}\sum_{n=0}^{N-1} f(x_n) \to \int f \,d\mu_{G/\Gamma}.$$We fix our attention on the sequence$$x_n = g^n x$$for some $g\in G$ and $x\in G/\Gamma$. As before we have an abelianisation map$$\pi: G/\Gamma\to G/G_2\Gamma$$from the $G$-space $G/\Gamma$ to the compact abelian group $G/G_2\Gamma$. We define equidistribution on $G/G_2\Gamma$ similarly. The theorem is then the following.

Theorem: (Leon Green) For $g\in G$ and $x\in G/\Gamma$ the following are equivalent.
1. The sequence $g^n x$ is equidistributed in $G/\Gamma$.
2. The sequence $\pi(g^n x)$ is equidistributed in $G/G_2\Gamma$.
3. The orbit of $\pi(g)$ is dense in $G/G_2\Gamma$.
4. $\chi(\pi(g))\neq 0$ for every nontrivial character $\chi:G/G_2\Gamma\to\mathbf{R}/\mathbf{Z}$.

Item 1 above trivially implies every other item. The implication 4$\implies$3 (a generalised Kronecker theorem) follows by pulling back any nontrivial character of $(G/G_2\Gamma)/\overline{\langle\pi(g)\rangle}$. The implication 3$\implies$2 (a generalised Weyl theorem) follows from the observation that every weak* limit point of the sequence of measures$$ \frac{1}{N}\sum_{n=0}^{N-1} \delta_{\pi(g^n x)}$$must be shift-invariant and thus equal to the Haar measure. So the interesting content of the theorem is 2$\implies$1.

A word about the relation to ergodicity: By the ergodic theorem the left shift $\tau_g:x\mapsto gx$ is ergodic if and only if for almost every $x$ the sequence $g^n x$ equidistributes; on the other hand $\tau_g$ is uniquely ergodic, i.e., the only $\tau_g$-invariant measure is the given one, if and only if for every $x$ the sequence $g^n x$ equidistributes. Thus to prove the theorem above we must not only prove that $\tau_g$ is ergodic but that it is uniquely ergodic. Fortunately one can prove these two properties are equivalent in this case.
Lemma: If $\tau_g:G/\Gamma\to G/\Gamma$ is ergodic then it's uniquely ergodic.
Proof: (Furstenberg) By the ergodic theorem the set $A$ of $\mu_{G/\Gamma}$-generic points, in other words points $x$ for which$$\frac{1}{N}\sum_{n=0}^{N-1} f(g^n x) \to \int f \,d\mu_{G/\Gamma}$$for every $f\in C(G/\Gamma)$, has $\mu_{G/\Gamma}$-measure $1$, and clearly if $x\in A$ and $c\in G_s$ then $xc\in A$, so $A = p^{-1}(p(A))$, where $p$ is the projection of $G/\Gamma$ onto $G/G_s\Gamma$.

Now let $\mu'$ be any $\tau_g$-invariant ergodic measure. By induction we may assume that $\tau_g:G/G_s\Gamma\to G/G_s\Gamma$ is uniquely ergodic, so we must have $p_*\mu' = p_*\mu_{G/\Gamma}$, so$$\mu'(A) = p_*\mu'(p(A)) = p_*\mu_{G/\Gamma}(p(A)) = \mu_{G/\Gamma}(A) = 1.$$But by the ergodic theorem the set of $\mu'$-generic points must also have $\mu'$-measure $1$, so there must be some point which is both $\mu_{G/\Gamma}$- and $\mu'$-generic, and this implies that $\mu'=\mu_{G/\Gamma}$.$\square$

We need one more preliminary lemma about topological groups before we really get started on the proof.
Lemma: If $H$ and $K$ are connected subgroups of some ambient topological group then $[H,K]$ is also connected.
Proof: Since $(h,k)\mapsto [h,k]=h^{-1}k^{-1}hk$ is continuous certainly $C = \{[h,k]:h\in H,k\in K\}$ is connected, so $C^n = CC\cdots C$ is also connected, so because $1\in C^n$ for all $n$ we see that $[H,K]=\bigcup_{n=1}^\infty C^n$ is connected.$\square$

Thus if $G$ is connected then every term $G_1,G_2,G_3,\dots$ in the lower central series of $G$ is connected.

Proof of Leon Green's theorem: As noted it suffices to prove that $\tau_g$ acts ergodically on $G/\Gamma$ whenever it acts ergodically on $G/G_2\Gamma$. By induction we may assume that $\tau_g$ acts ergodically on $G/G_s\Gamma$. So suppose that $f\in L^2(G/\Gamma)$ is $\tau_g$-invariant. By decomposing $L^2(G/\Gamma)$ as a $\overline{G_s\Gamma}/\Gamma$-space we may assume that $f$ obeys$$f(cx)=\gamma(c)f(x)\quad(c\in G_s, x\in G/\Gamma)$$for some character $\gamma:G_s\Gamma/\Gamma\to S^1$. In particular $|f|$ is both $G_s$-invariant and $\tau_g$-invariant, so it factors through a $\tau_g$-invariant function $G/G_s\Gamma\to\mathbf{R}$, so it must be constant, say $1$. Moreover for every $b\in G_{s-1}$ the function$$\Delta_bf(x) = f(bx)\overline{f(x)}$$is $G_s$-invariant, and also a $\tau_g$ eigenvector:$$\Delta_bf(gx) = \gamma([b,g])\Delta_bf(x).$$By integrating this equation we find that either $\gamma([b,g])=1$, so $\Delta_bf$ is constant, or $\int\Delta_bf \,d\mu_{G/\Gamma}= 0$, so either way we have$$\int \Delta_bf\,d\mu_{G/\Gamma}\in \{0\}\cup S^1.$$But since $\int\Delta_bf\,d\mu_{G/\Gamma}$ is a continuous function of $b$ and equal to $1$ when $b=1$ we must have $\gamma([b,g])=1$ for all sufficiently small $b$, and thus for all $b$ by connectedness of $G_{s-1}$ and the identity$$[b_1b_2,g]=[b_1,g][b_2,g].$$Thus setting $\gamma(b)=\Delta_bf$ extends $\gamma$ to a homomorphism $G_{s-1}\to S^1$. In fact we can extend $\gamma$ still further to a function $G\to D_1$, where $D_1$ is the unit disc in $\mathbf{C}$, by setting$$\gamma(a) = \int \Delta_af\,d\mu_{G/\Gamma}.$$Now if $a\in G$ and $b\in G_{s-1}$ then$$\gamma(ba) = \int f(bax) \overline{f(x)}\,d\mu_{G/\Gamma} = \int \gamma(b)f(ax)\overline{f(x)}\,d\mu_{G/\Gamma}=\gamma(b)\gamma(a),$$and$$\gamma(ab) = \int f(abx)\overline{f(x)}\,d\mu_{G/\Gamma} = \int f(ax) \overline{f(b^{-1}x)}\,d\mu_{G/\Gamma} = \int f(ax) \overline{\gamma(b^{-1})}\overline{f(x)}\,d\mu_{G/\Gamma} = \gamma(b)\gamma(a),$$so$$\gamma(b)\gamma(a)=\gamma(ab)=\gamma(ba[a,b]) = \gamma(ba)\gamma([a,b])=\gamma(b)\gamma(a)\gamma([a,b]).$$Since $|\gamma(b)|=1$ we can cancel $\gamma(b)$, so$$\gamma(a)(\gamma([a,b])-1) = 0.$$Finally observe that $\gamma(a)$ is a continuous function of $a$, and $\gamma(1)=1$, so we must have $\gamma([a,b])=1$ for all sufficiently small $a$, and thus by connectedness of $G$ and the identity$$[a_1a_2,b]=[a_1,b][a_2,b]$$we must have $\gamma([a,b])=1$ identically. But this implies that $\gamma$ vanishes on all $s$-term commutators and thus on all of $G_s$, so in fact $f$ factors through $G/G_s\Gamma$, so it must be constant.$\square$

A remark is in order about the possibility that some of the groups $G_i$ and $G_i\Gamma$ are not closed. This should not matter. One could either read the above proof as it is written, noting carefully that I never said groups should be Hausdorff, or, what's similar, instead modify it so that whenever you quotient by a group $H$ you instead quotient by the group $\overline{H}$.

Embarrassingly, it's difficult to come up with a non-Lie group to which this generalised Leon Green's theorem applies. It seems that many natural candidates have the property that $G$ is not connected but $G/\Gamma$ is: for example consider$$\left(\begin{array}{ccc}1&\mathbf{R}\times\mathbf{Q}_2&\mathbf{R}\times\mathbf{Q}_2\\0&1&\mathbf{R}\times\mathbf{Q}_2\\0&0&1\end{array}\right)/\left(\begin{array}{ccc}1&\mathbf{Z}[1/2]&\mathbf{Z}[1/2]\\0&1&\mathbf{Z}[1/2]\\0&0&1\end{array}\right).$$So it would be interesting to know whether the theorem extends to such a case. Or perhaps there are no interesting non-Lie groups for this theorem, which would be a bit of a let down.

Wednesday, 19 November 2014

Joseph's conjectures on commuting probability, an ultrafinitary perspective

The commuting probability $\Pr(G)$ of a finite group $G$ is the proportion of pairs $(x,y)\in G^2$ which commute. In 1977 Keith Joseph made three conjectures about the set
$$\mathcal{P} = \{\Pr(G) : G \text{ a finite group}\},$$
namely the following.

Joseph's conjectures: 
J1. All limit points of $\mathcal{P}$ are rational.
J2. $\mathcal{P}$ is well ordered by >.
J3. $\{0\}\cup\mathcal{P}$ is closed.


Earlier this month I uploaded a preprint to the arxiv which proves the first two of these conjectures, and yesterday I gave a talk at the algebra seminar in Oxford about the proof. While preparing the talk I noticed that some aspects of the proof are simpler from an ultrafinitary perspective, basically because ultrafilters can be used to streamline epsilon management, and I gave one indication of this perspective during the talk. In this post I wish to lay out the ultrafinitary approach in greater detail.

Throughout this post we fix a nonprincipal ultrafilter $u\in\beta\mathbf{N}\setminus\mathbf{N}$, and we let $\mathbf{R}^*$ be the ultrapower $\mathbf{R}^\mathbf{N}/u$, where two elements of $\mathbf{R}^\mathbf{N}$ are considered equivalent iff they are equal in $u$-almost-every coordinate. The elements of $\mathbf{R}^*$ are called "nonstandard reals" or "hyperreals". There is a principle at work in nonstandard analysis, possibly called Łoś's theorem, which asserts, without going into the finer details, that all "first-order" things that you do with reals carry over in a natural way to the field of hyperreals, and everything more-or-less works just how you'd like it to. For instance, if $r$ and $s$ are hyperreals then $r<s$ naturally means that the inequality holds in $u$-almost-every coordinate, and similarly the field operations of $\mathbf{R}$ extend naturally, and with these definitions $\mathbf{R}^*$ becomes a totally ordered field. We will seldom spell out so explicitly how to naturally extend first-order properties in this way.

An ultrafinite group $G$ is an ultraproduct $\prod_u G_i = \prod_{i=1}^\infty G_i/u$ of finite groups. Its order $|G| = (|G_i|)/u$ is a nonstandard natural number, and its commuting probability $\Pr(G) = (\Pr(G_i))/u$ is a nonstandard real in the interval $[0,1]$. Joseph's first two conjectures can now be stated together in the following way.
Theorem: The commuting probability of every ultrafinite group $G$ has the form $q+\epsilon$, where $q$ is a standard rational and $\epsilon$ is a nonnegative infinitesimal.
Somewhat similarly, Joseph's third conjecture can be stated in an ultrafinitary way as follows: For every ultrafinite group $G$ there is a finite group $H$ such that $\text{st}\Pr(G)=\Pr(H)$. Here $\text{st}$ is the standard part operation, which maps a finite hyperreal to the nearest real. When phrased in this way it resembles a known result about compact groups. Every compact group $G$ has a unique normalised Haar measure, so we have a naturally defined notion of commuting probability $\Pr(G)$. However every compact group $G$ with $\Pr(G)>0$ has a finite-index abelian subgroup, and with a little more work one can actually find a finite group $H$ with $\Pr(G)=\Pr(H)$. This is a theorem of Hofmann and Russo. Nevertheless, I find Joseph's third conjecture rather hard to believe.

For us the most important theorem about commuting probability is a theorem of Peter Neumann, which states that if $\epsilon>0$ then every finite group $G$ such that $\Pr(G)\geq\epsilon$ has a normal subgroup $H$ such that $|G/H|$ and $|[H,H]|$ are both bounded in terms of $\epsilon$. To prove the above theorem we need the following "amplified" version:

Theorem (Neumann's theorem, amplified, ultrafinitary version): Every ultrafinite group $G$ has an internal normal subgroup $H$ such that $[H,H]$ is finite and such that almost every pair $(x,y)\in G^2$ such that $[x,y]\in[H,H]$ is contained in $H^2$.
Here if $G = \prod_u G_i$ we say that $S\subset G$ is internal if $S$ is itself an ultraproduct $\prod_u S_i$ of subsets $S_i\subset G_i$, and "almost every" needs little clarification because the set of pairs in question is an internal subset of $G^2$. (Otherwise we would need to introduce Loeb measure.)

Proof: If $\text{st}\Pr(G)=0$ the theorem holds with $H=1$, so we may assume $\text{st}\Pr(G)>0$. By Neumann's theorem $G$ has an internal normal subgroup $K_0$ of finite index such that $[K_0,K_0]$ is finite. Since $G/K_0$ is finite there are only finitely many normal subgroups $K\leq G$ containing $K_0$ and each of them is internal, so we may find normal subgroups $K,L\leq G$ containing $K_0$ such that $[K,L]$ is finite, and which are maximal subject to these conditions.

Suppose that a positive proportion of pairs $(x,y)\in G^2$ outside of $K\times L$ satisfied $[x,y]\in[K,L]$. Then we could find $(x,y)\in G^2\setminus (K\times L)$, say with $x\notin K$, such that for a positive proportion of $(k,l)\in K\times L$ we have $[xk,yl]\in[K,L]$. After a little commutator algebra one can show then that for a positive proportion of $l\in L$ we have $[x,l]\in[K,L]$, or in other words that the centraliser
$$ N_0 = C_{L/[K,L]}(x) = C_{L/[K,L]}(\langle K,x\rangle)$$
of $x$ in $L/[K,L]$ has finite index. But this implies that the largest normal subgroup contained in $N_0$, namely
$$ N = C_{L/[K,L]}(K'),$$
where $K'$ is the normal subgroup of $G$ generated by $K$ and $x$, also has finite index. Since certainly $K\leq C_{K'/[K,L]}(L)$ a classical theorem of Baer implies that
$$[K'/[K,L], L/[K,L]] = [K',L]/[K,L]$$
is finite, and hence that $[K',L]$ is finite, but this contradicts the maximality of $K$ and $L$.

Hence almost every pair $(x,y)\in G^2$ such that $[x,y]\in[K,L]$ is contained in $K\times L$, and thus also in $L\times K$, so the theorem holds for $H=K\cap L$.$\square$

Now let $G$ be any ultrafinite group $G$ and let $H$ be as in the theorem. Then
$$\Pr(G) = \frac1{|G/H|^2} \Pr(H) + \epsilon,$$
where $\epsilon$ is nonnegative and infinitesimal. Thus it suffices to show that $\Pr(H)$ has the form (standard rational) + (nonnegative infinitesimal) whenever $[H,H]$ is finite. Note in this case that Hall's theorem implies that the second centre
$$Z_2(H) = \{h\in H : [h,H]\subset Z(H)\}$$
has finite index. One can complete the proof using a little duality theory of abelian groups, but the ultrafinite perspective adds little here so I refer the reader to my paper.

The other thing I noticed while preparing my talk is that the best lower bound I knew for the order type of $\mathcal{P}$, $\omega^2$, is easy to improve to $\omega^\omega$, just by remembering that $\mathcal{P}$ is a subsemigroup of $(0,1]$. In fact the order type of a well ordered subsemigroup of $(0,1]$ is heavily restricted: it's either $0$, $1$, or $\omega^{\omega^\alpha}$ for some ordinal $\alpha$. This observation reduces the possibilities for the order type of $\mathcal{P}$ to $\{\omega^\omega,\omega^{\omega^2}\}$. I have no idea which it is!

Thursday, 13 November 2014

The idempotent theorem

Let $G$ be a locally compact abelian group and let $M(G)$ be the Banach algebra of regular complex Borel measures on $G$. Given $\mu\in M(G)$ its Fourier transform
$$\hat{\mu}(\gamma) = \int \overline{\gamma}\,d\mu,$$
is a continuous function defined on the Pontryagin dual $\hat{G}$ of $G$. If the measure $\mu$ is "nice" in some way then we expect some amount of regularity from the function $\hat{\mu}$. For instance if $\mu$ is actually an element of the subspace $L^1(G)\subset M(G)$ of measures absolutely continuous with respect to the Haar measure of $G$ then the Riemann-Lebesgue lemma asserts $\hat{\mu}\in C_0(\hat{G})$.

The idempotent theorem of Cohen, Helson, and Rudin describes the structure of measures $\mu$ whose Fourier transform $\hat{\mu}$ takes a discrete set of values, or equivalently, since $\|\hat{\mu}\|_\infty\leq\|\mu\|$, a finite set of values. To describe the theorem, note that we can define $P(\mu)$ for any polynomial $P$ by taking appropriate linear combinations of convolution powers of $\mu$, and moreover we have the relation $\widehat{P(\mu)} = P(\hat{\mu})$, where on the right hand side we apply $P$ pointwise. Thus if $\hat{\mu}$ takes only the values $a_1,\dots,a_n$ then by setting
$$P_i(x) = \prod_{j\neq i} (x-a_j)/(a_i-a_j)$$
we obtain a decomposition $\mu = a_1\mu_1 + \cdots + a_n\mu_n$ of $\mu$ into a linear combination of measures $\mu_i=P_i(\mu)$ whose Fourier transforms $\hat{\mu_i} = P_i(\hat{\mu})$ take only values $0$ and $1$. Such measures are called idempotent, because they are equivalently defined by $\mu\ast\mu=\mu$. By the argument just given it suffices to characterise idempotent measures: this explains the name of the theorem.

The most obvious example of an idempotent measure is the Haar measure $m_H$ of a compact subgroup $H\leq G$. Moreover we can multiply any idempotent measure $\mu$ by a character $\gamma\in\hat{G}$ to obtain a measure $\gamma\mu$ defined by
$$\int f \,d(\gamma\mu) = \int f\gamma\,d\mu.$$
This measure $\gamma\mu$ will again be idempotent, as
$$\int f\,d(\gamma \mu\ast\gamma \mu) = \int\int f(x+y)\gamma(x)\gamma(y)\,d\mu(x)d\mu(y) = \int\int f(x+y) \gamma(x+y)\,d\mu(x)d\mu(y) = \int f\gamma\,d\mu.$$
If we add or subtract two idempotent measures then though we may not have again an idempotent measure we certainly have a measure whose Fourier transform takes integer values. On reflection, it feels more natural in the setting of harmonic analysis to require that $\hat{\mu}$ takes values in a certain discrete subgroup than to require that it take values in $\{0,1\}$, so we relax our restriction so. The idempotent theorem states that we have already accounted for all those $\mu$ such that $\hat{\mu}$ is integer-valued.

Theorem (the idempotent theorem): For every $\mu\in M(G)$ such that $\hat{\mu}$ is integer-valued there is a finite collection of compact subgroups $G_1,\dots,G_k\leq G$, characters $\gamma_1,\dots,\gamma_k\in\hat{G}$, and integers $n_1,\dots,n_k\in\mathbf{Z}$ such that
$$\mu = n_1\gamma_1 m_{G_1} + \cdots + n_k\gamma_k m_{G_k}.$$
As a consequence we deduce a structure theorem for $\mu$ with $\hat{\mu}$ taking finitely many values, as we originally wanted: for every such $\mu$ there is a finite collection of compact subgroups $G_1,\dots,G_k\leq G$, characters $\gamma_1,\dots,\gamma_k\in\hat{G}$, and complex numbers $a_1,\dots,a_k\in\mathbf{C}$ such that
$$\mu = a_1\gamma_1 m_{G_1} + \cdots + a_k \gamma_k m_{G_k}.$$

The theorem was first proved in the case of $G=\mathbf{R}/\mathbf{Z}$ by Helson in 1953: in this case the theorem states simply that if $\hat{\mu}$ is integer-valued then it differs from some periodic function in finitely many places. In 1959 Rudin gave the theorem its present form and proved it for $(\mathbf{R}/\mathbf{Z})^d$. Finally in 1960 Cohen proved the general case, in the same paper in which he made the first substantial progress on the Littlewood problem. The proof was subsequently simplified a good deal, particularly by Amemiya and Ito in 1964. We reproduce their proof here.

First note that if $\hat{\mu}$ is integer-valued then $\mu$ is supported on a compact subgroup. Indeed by inner regularity there is a compact set $K$ such that $|\mu|(K^c)<0.1$, the set $U$ of all $\gamma\in\hat{G}$ such that $|1-\gamma|<0.1/\|\mu\|$ on $K$ is then open, and if $\gamma\in U$ then
$$\|\gamma\mu-\mu\| = \int_G |\gamma-1|\,d|\mu| \leq \int_K + \int_{K^c} < 0.1 + 0.1 < 1.$$
But if $\gamma\mu\neq\mu$ then
$$\|\gamma\mu-\mu\|\geq \|\widehat{\gamma\mu}-\hat{\mu}\|_\infty \geq 1,$$
so $\gamma\mu=\mu$ for all $\gamma\in U$. Thus $\Gamma=\{\gamma\in\hat{G}: \gamma\mu=\mu\}$ is an open subgroup of $\hat{G}$, so by Pontryagin duality its preannihilator $\Gamma^\perp = \{g\in G: \gamma(g)=1 \text{ for all }\gamma\in\Gamma\}$ is a compact subgroup of $G$. Clearly $\mu$ is supported on $\Gamma^\perp$. Thus from now on we assume $G$ is compact.

Fix a measure $\mu\in M(G)$ and let $A=\{\gamma\mu: \gamma\in\hat{G}\}$.
Lemma 1: If $\nu$ is a weak* limit point of $A$ then $\|\nu\|<\|\mu\|$.
Proof: Fix $\epsilon>0$ and suppose we could find $f\in C(G)$ such that $\|f\|_\infty\leq 1$ and $\int f\,d\nu > (1-\epsilon)\|\mu\|$. Let $\gamma\mu$ be close enough to $\nu$ that $\Re\int f\gamma\,d\mu > (1-\epsilon)\|\mu\|$. Write $\mu = \theta|\mu|$ and $f\gamma\theta = g + ih$. Then if $Z$ is the complex number $Z = \int (g+i|h|)\,d|\mu|$, then $|Z|\leq\|\mu\|$ and
$$\Re Z = \int g \,d|\mu| = \Re\int f\gamma\,d\mu > (1-\epsilon)\|\mu\|,$$
so we must have
$$\Im Z = \int |h|\,d|\mu| \leq (1-(1-\epsilon)^2)^{1/2}\|\mu\| \leq 2\epsilon^{1/2}\|\mu\|.$$
Thus also
$$\int |1 - f\gamma\theta| \,d|\mu| \leq \int |1 - g|\,d|\mu| + \int |h|\,d|\mu| \leq 3\epsilon^{1/2}\|\mu\|.$$
But if this holds for both $\gamma_1\mu$ and $\gamma_2\mu$, say with $\gamma_1\mu\neq\gamma_2\mu$, then we have
$$ 1\leq \|\gamma_1\mu-\gamma_2\mu\| \leq \int |\gamma_1 - f\gamma_1\gamma_2\theta|\,d|\mu| + \int |\gamma_2 - f\gamma_1\gamma_2\theta|\,d|\mu| \leq 6\epsilon^{1/2}\|\mu\|,$$
so $\epsilon \geq 1/(36\|\mu\|^2)$, so
$$\|\nu\| \leq \|\mu\| - \frac{1}{36\|\mu\|}.\square$$
Lemma 2: If $\nu$ is a weak* limit point of $A$ then $\nu$ is singular with respect to the Haar measure $m_G$ of $G$.
Proof: By the Radon-Nikodym theorem we have a decomposition $\mu = f m_G + \mu_s$ for some $f\in L^1(G)$ and some $\mu_s$ singular with respect to $m_G$. By the Riemann-Lebesgue lemma then $\nu$ is a limit point of $\{\gamma\mu_s:\gamma\in\hat{G}\}$. Thus for any open set $U$ and $f\in C(G)$ such that $\|f\|_\infty\leq 1$ and $f=0$ outside of $U$ we have
$$\left|\int f\,d\nu\right| \leq \sup_\gamma \left|\int f\gamma \,d\mu_s\right| \leq |\mu_s|(U),$$
so $|\nu|(U)\leq |\mu_s|(U)$. This inequality extends to Borel sets in the usual way, so $\nu$ is singular.$\square$

The theorem follows relatively painlessly from the two lemmas. Fix $\mu\in M(G)$ with $\hat{\mu}$ integer-valued and let $A = \{\gamma\mu: \int\gamma\,d\mu\neq 0\}$. Then $\overline{A}$ is weak* compact, so because $\|\cdot\|$ is lower semicontinuous in the weak* topology there is some $\nu\in\overline{A}$ of minimal norm. Since $\int d\nu$ is an integer different from $0$ we must have $\nu\neq 0$. Thus by Lemma 1 the set $\{\gamma\nu: \int\gamma\,d\nu\neq 0\}$ is finite. But this implies that
$$\nu = (n_1 \gamma_1 + \cdots + n_k \gamma_k) m_H\qquad(\ast)$$
for some $n_1,\dots,n_k\in\mathbf{Z}$, $\gamma_1,\dots,\gamma_k\in\hat{G}$, and $H=\{\gamma:\gamma\nu=\nu\}^\perp$ the support group of $\nu$. In particular $\nu$ is absolutely continuous with respect to $m_H$, so because $\nu|_H$ is in the weak* closure of $\{\gamma\mu|_H:\gamma\in\hat{G}\}$ we deduce from Lemma 2 that $\nu|_H = \gamma\mu|_H$ for some $\gamma$.  Thus $\mu|_H$ is a nonzero measure of the form $(\ast)$ and we have an obvious mutually singular decomposition
$$\mu = \mu|_H + (\mu-\mu|_H).$$
Since $\|\mu-\mu|_H\| = \|\mu\| - \|\mu|_H\|\leq\|\mu\|-1$ the theorem follows by induction.