Thomas R. Nicely
Last revised 0800 GMT 05 February 2011.
Date of first release 16 July 1999.
The content of this document (other than the addendum, which was not part of the submission for publication) is essentially that of the original release, except that information rendered obsolete by subsequent events has been removed or modified, in both the main document and the addendum (this liberty is taken in view of the fact that the paper was never accepted for publication). There may also be differences in formatting, and in minor details and corrections, including updated URLs.
The present study results from the continuation of a project initiated in 1993, with results to 1e14 described in the author's 1995 paper [13]. A detailed description of the general problem, the computational methods employed, and the incidental discovery of the Pentium FDIV flaw may be found there, with additional details given in [14] and [15]; only a brief summary will be included here.
The prime numbers themselves continue to retain most of their secrets, but still less is known about the twin primes. A matter as fundamental as the infinitude of K_2 remains undecided---the famous "twin-primes conjecture," a topic of discussion even in a recent popular motion picture [21]. Nonetheless, Brun [3] proved in 1919 that in any event the sum of the reciprocals,
(1.1) B_2 = (1/3 + 1/5) + (1/5 + 1/7) + (1/11 + 1/13) + (1/17 + 1/19) + ... ,
is convergent, in contrast to the known divergence of the sum of the reciprocals of all the primes. The limit of this sum, styled Brun's sum or Brun's constant, is usually denoted by simply B (although Fröberg [4] used S), but herein the author will use B_2. In this instance, as in a number of others noted in this paper, identifiers have been changed from those in [2] and [13], in anticipation of the need for analogous symbols to be used in the study of prime constellations other than the twins.
The twin-primes conjecture is a consequence of a much stronger result proposed by Hardy and Littlewood [9] in 1923,
(1.2) pi_2(x) ~ L_2(x) = 2c_2*int(1/(ln(t))^2, t, 2, x)
where pi_2(x) represents the count of twin-prime pairs (q, q+2) such that q <= x, and c_2 denotes the "twin-primes constant," computed to 42D by Wrench [24] in 1961,
(1.3) c_2 = 0.66016 18158 46869 57392 78121 10014 55577 84326 23...
The validity of the conjecture (1.2), sometimes titled the Hardy-Littlewood approximation, is central to the estimation of Brun's constant and the error bounds in this paper. The Hardy-Littlewood approximation is itself a special case of the yet more general "prime k-tuples conjecture," also set forth in [9]. See Riesel [16, pp. 60-83] for an illuminating exposition of these concepts.
Although (1.1) is convergent, the monotonically increasing partial sums approach the limit with agonizing slowness; summing the first thousand million reciprocals is still insufficient to bring us within five percent of the estimated value of the limit. However, assuming the validity of the Hardy-Littlewood approximation (1.2), a first order extrapolation was derived by Fröberg [4] and further studied by Brent [2],
(1.4) B_2 = S_2(x) + 4c_2/ln(x) + O(1/(sqrt(x)*ln(x))
with an accelerated rate of convergence O(sqrt(x)) faster than (1.1). Here S_2(x) is the partial sum
(1.5) S_2(x) = sum(1/q + 1/(q+2), q, q <= x) .
of the reciprocals of all the twin-prime pairs (q, q+2) for which q <= x. Note that S_2(x) is written as B(x) in [2] and [13]. The first order extrapolation of S_2(x) to approximate B_2 consists of the first two terms of the right hand side of (1.4); this was indicated as B*(x) in [2] and [13], but we write it here as F_2(x):
(1.6) F_2(x) = S_2(x) + 4c_2/ln(x) .
The final term in (1.4) is the author's conjectured error or remainder term, inspired by Brent's probabilistic analysis [2]. As discussed by Shanks and Wrench [20, p. 298], no effective second-order extrapolation is known.
The reciprocal sums were calculated in both long double precision (64 bit mantissa, 19S) and ultraprecision (53D), the former in hardware using the Intel numeric coprocessors, the latter in software using a modification of a package due largely to Arjen Lenstra. Results were written to a binary file at intervals of 1e10.
As mentioned previously, additional details regarding the computational technique, and the Pentium FDIV affair, are available in the cited references [13, 14, 15], and also at the author's URL.
(3.1) delta_2(x) = L_2(x) - pi_2(x) ;
the partial sums S_2(x) of the reciprocals of the twins; and the first order extrapolations F_2(x) of S_2(x) to the limit, according to (1.6), members of a sequence believed to be converging to Brun's constant B_2. Note that the discrepancy delta_2(x) was written in [2] and [13] as r_3(x); Brent also rounded this value to the nearest integer.
Table 1 includes results previously published for powers of ten to 1e14, in addition to new results from the present study for x = (1.5e14)(5e13)(1.6e15). Expanded versions of Table 1 are available at the author's URL, as are extensive tables of the values of pi(x) recorded in this project, at much finer granularity than those commonly available; or see this site for complete results of the computations.
Table 1. Counts of twin-prime pairs and estimates of Brun's constant. ======================================================================= x pi_2(x) delta_2(x) S_2(x) F_2(x) ======================================================================= 1e01 2 2.84 0.8761904761905 2.0230090113326 1e02 8 5.54 1.3309903657191 1.9043996332901 1e03 35 10.80 1.5180324635596 1.9003053086070 1e04 205 9.21 1.6168935574322 1.9035981912177 1e05 1224 24.71 1.6727995848277 1.9021632918562 1e06 8169 79.03 1.7107769308042 1.9019133533279 1e07 58980 -226.18 1.7383570439173 1.9021882632233 1e08 440312 55.79 1.7588156210680 1.9021679379607 1e09 3424506 802.16 1.7747359576385 1.9021602393210 1e10 27412679 -1262.47 1.7874785027192 1.9021603562335 1e11 224376048 -7183.32 1.7979043109551 1.9021605414226 1e12 1870585220 -25353.18 1.8065924191759 1.9021606304377 1e13 15834664872 -66566.94 1.8139437606846 1.9021605710802 1e14 135780321665 -56770.51 1.8202449681303 1.9021605777833 1.50e14 198466021764 -41424.45 1.8212624999372 1.9021605778478 2.00e14 259858400254 -286596.19 1.8219692563019 1.9021605806674 2.50e14 320314774824 -251822.16 1.8225090097373 1.9021605803406 3.00e14 380041003032 -386165.49 1.8229446574499 1.9021605813179 3.50e14 439169847802 -377896.90 1.8233092906771 1.9021605812270 4.00e14 497794845572 -687458.42 1.8236224494488 1.9021605828234 4.50e14 555983876545 -538028.53 1.8238966154614 1.9021605820874 5.00e14 613790177314 -495402.94 1.8241402488571 1.9021605819011 5.50e14 671255245533 -412495.10 1.8243593389029 1.9021605816063 6.00e14 728412916123 -399030.90 1.8245582810368 1.9021605816028 6.50e14 785290803354 -219998.15 1.8247403931446 1.9021605810260 7.00e14 841912734248 -330271.47 1.8249082431040 1.9021605813540 7.50e14 898298225772 -362599.07 1.8250638545912 1.9021605814440 8.00e14 954464283498 -207253.20 1.8252088524970 1.9021605810407 8.50e14 1010426459312 -534774.22 1.8253445623449 1.9021605818329 9.00e14 1066196920739 -459168.78 1.8254720744001 1.9021605816527 9.50e14 1121787648852 -422043.55 1.8255923015512 1.9021605815759 1.00e15 1177209242304 -750443.32 1.8257060132403 1.9021605822498 1.05e15 1232470313145 -856839.63 1.8258138623433 1.9021605824537 1.10e15 1287579137984 -732612.87 1.8259164099410 1.9021605822159 1.15e15 1342543717345 -879045.52 1.8260141417697 1.9021605824791 1.20e15 1397370335220 -761338.54 1.8261074785719 1.9021605822802 1.25e15 1452065612798 -774787.26 1.8261967900949 1.9021605822983 1.30e15 1506635099560 -762644.45 1.8262824008978 1.9021605822837 1.35e15 1561084329425 -1010551.62 1.8263645987282 1.9021605826562 1.40e15 1615417411648 -785068.05 1.8264436378766 1.9021605823288 1.45e15 1669639673260 -931388.83 1.8265197475581 1.9021605825390 1.50e15 1723754585354 -761213.67 1.8265931311402 1.9021605823084 1.55e15 1777766583441 -839920.90 1.8266639732589 1.9021605824132 1.60e15 1831678961614 -851925.37 1.8267324395006 1.9021605824283 =======================================================================
(4.1) B_2 = 1.90216 05824 +/- 0.00000 00030 .
The error estimate is believed to represent a 99 % confidence level, although this is a subjective opinion of the author for which definitive proof remains lacking. More precisely, the author believes that for at least 99 % of the integers in any sufficiently large set of distinct integers x > 1, Brun's constant B_2 would lie between F_2(x) - E_2(x) and F_2(x) + E_2(x). Here E_2(x) is the error bound function stated in (4.6) below, from which the error bound in (4.1) will be obtained. The error estimate is arrived at as follows.
First, it is assumed that the deviations F_2(x) - B_2 change sign infinitely often; more precisely, given any x_0, however large, there will exist integers x_1 > x_0 and x_2 > x_0 such that F_2(x_1) < B_2 and F_2(x_2) > B_2. We will refer to this as hypothesis [H2]. Note that although [H2] is neither necessary nor sufficient for the Hardy-Littlewood approximation (1.2), it will almost certainly fail (along with (4.1) in its entirety) if Hardy-Littlewood is false. In support of [H2], we note that if our best estimate F_2(1.6e15) is used for B_2, then F_2(x) - B_2 is observed to undergo 872 sign changes over the 160081 data points recorded in the present study, with 677 sign changes occurring beyond 1e15. Brent [2] presented a heuristic argument for a much stronger characterization of the deviations, one which would imply [H2], namely that the values of the scaled deviations
(4.2) P_2(x) = sqrt(x)*ln(x)*[F_2(x) - B_2]
(indicated in [13] by P(x)) are asymptotically normally distributed with mean o(1) and standard deviation sqrt(8c_2)=2.2981. A statistical analysis of P_2(x), using F_2(1.6e15) for B_2 and the 160081 data points from the present computation, indicates a distribution that is only roughly normal, with mean -0.5294, standard deviation 0.7489, and widely distributed sign changes. As in a similar analysis performed by the author in [13], the standard deviation was significantly less than Brent's prediction; however, this statistic is quite sensitive to the value presumed for B_2.
Given [H2], we then look for the maximum computable value of the function
(4.3) N_2(x_1, x_2) = sqrt(x_1)*ln(x_1)*|F_2(x_1) - F_2(x_2)|
where x_1 and x_2 are integers such that x_2 > x_1 > 1. N_2 may be considered as a "normalized" measure of the amplitude of the oscillations in F_2(x), or alternately as a measure of the scaled deviation of a "predictor" value F_2(x_1) from a "terminal approximation" F_2(x_2). In contrast to P_2, N_2 has the virtue that it is independent of the uncertain value of B_2. If N_2 has a global maximum, or even an upper bound, then given [H2] this must also represent an upper bound on the absolute value of P_2, thus producing an unconditional error bound; for any specific value |P_2(x_3)| will be exceeded by N_2(x_3, x_4), where x_4 is chosen so that x_4 > x_3 while F_2(x_3) - B_2 and F_2(x_4) - B_2 are of opposite sign ([H2] implies that there is an infinite sequence of such integers x_4 for any given integer x_3). In practice, we are unable to prove that N_2 has a global maximum, although (1.4) would imply that P_2 is bounded. Indeed, it is computationally impractical even to find the absolute maximum of N_2 over all integers in the range under investigation, 1 < x_1 < x_2 <= 1.6e15; this would involve the comparison of more than 1e30 data pairs (F_2(x_1), F_2(x_2)), a computation which might bring an entire Web of teraflop computers to its knees. What has been determined is that the absolute maximum of N_2 over all of the (more than 1e10 such) data pairs recorded in this study is
(4.4) N_2(1000000, 9000000) = 3.805437 .
However, additional computations reveal an even larger value of N_2 in the vicinity of this point,
(4.5) N_2(872140, 4265159) = 4.14083 ,
where the values in both (4.4) and (4.5) have been rounded up. Although the value 4.14083 is still not established as the sought after upper bound on N_2 (and thus on |P_2|), the numerical evidence (zero exceptions among more than 1e10 data pairs extending over fifteen orders of magnitude) indicates that if any larger values of N_2 (and thus |P_2|) exist, they must be relatively rare. Our intuitive conclusion is that for the great majority (99 percent or more?) of integers x > 1, it will be true that |P_2(x)| < 4.14083, producing the error bound formula E_2(x),
(4.6) |F_2(x) - B_2| < E_2(x) = 4.14083/(sqrt(x)*ln(x)) .
In arriving at our error estimate in (4.1), we are assuming that (4.6) does in fact hold specifically for x = 1.6e15, so that
(4.7) |F_2(1.6e15) - B_2| < E_2(1.6e15) = 4.14083/(sqrt(1.6e15)*ln(1.6e15)) ,
giving
(4.8) |F_2(1.6e15) - B_2| < 0.00000 00029 57 ,
which, with rounding up for good measure, produces the error estimate given in (4.1).
The error bound formula E_2(x) in (4.6) is similar to the one arrived at (using different reasoning) by Brent [2]. Brent obtains the constant 3.5 in place of 4.14083---partly because he used an estimate for B_2 in place of F_2(x_2), partly because he did not consider it important to perform a higher resolution search for a larger magnitude of the scaled deviation, and partly because of the different interpretation he placed on the resulting error bound; Brent regarded the value 3.5 as representing approximately 1.523 of his standard deviations of P_2(x) (that is, 1.523*sqrt(8c_2)), yielding a probability (based on the presumed normal distribution) of approximately 88 % that B_2 would fall within the resulting error bound. By contrast, the computed standard deviation of P_2(x) here is only 0.7489; even given that this is artificially low due to the use of F_2(1.6e15) as B_2, and taking into account the non-zero mean, this implies that the error bound E_2(x) represents a variation of nearly five standard deviations from the mean, additional support for the opinion of a 99 % (or better) confidence level. On the other hand, the error bound given by the author in [13] was specifically stated as representing one computed standard deviation at 1e14, so it is not especially surprising that the present value for B_2 exceeds the previous estimate by slightly more than two standard deviations (as computed at 1e14).
It should be noted that the scaling factor sqrt(x)*ln(x) was chosen for two reasons---firstly, the persuasiveness of Brent's argument in [2]; secondly, analysis of the data shows that the resulting mean absolute value of P_2(x) remains of the same order of magnitude (typically a few tenths) over most intervals of 1e12 from 0 through 1.6e15, as would be expected with a valid scaling factor. However, other scaling factors could be employed, and the author in fact investigated several alternatives. No other yielded superior results; either the resulting error bound was inferior, or a "near-maximum" value of the corresponding function N_2 (meaning one sufficiently large to provide an error bound of equal or better confidence level) could not be located without greatly escalating the computational effort.
In summary, the error bound E_2(x) set forth here is more conservative than that of Brent [2] and much more conservative than that given in the author's previous work [13]. More than ten thousand million pairs of first order extrapolations have agreed with each other within this error bound, without exception. Nonetheless, it carries no unconditional guarantee; overwhelming numerical evidence is occasionally a subtle trap, as the conjecture Li(x) > pi(x) so memorably illustrated. However, it would appear that a more rigorous error bound must await a deeper understanding of the subtleties in the distribution of the twin primes---perhaps the discovery of a "Riemann zeta function" for the twins.
Table 2. Counts of primes and twin-prime pairs at primorial values of x. ================================================================== x x pi(x) pi_2(x) ================================================================== 2# 2 1 0 3# 6 3 2 5# 30 10 5 7# 210 46 15 11# 2310 343 70 13# 30030 3248 468 17# 510510 42331 4636 19# 9699690 646029 57453 23# 223092870 12283531 896062 29# 6469693230 300369796 18463713 31# 200560490130 8028643010 425177757 37# 7420738134810 259488750744 11997649372 41# 304250263527210 9414916809095 385088898632 43# 13082761331670030 362597750396740 13280323588034 ==================================================================