The Foundations of Science: Science and Hypothesis, The Value of Science, Science and Method

CHAPTER XI

THE CALCULUS OF PROBABILITIES

Doubtless it will be astonishing to find here thoughts about the calculus of probabilities. What has it to do with the method of the physical sciences? And yet the questions I shall raise without solving present themselves naturally to the philosopher who is thinking about physics. So far is this the case that in the two preceding chapters I have often been led to use the words 'probability' and 'chance.'

'Predicted facts,' as I have said above, 'can only be probable.' "However solidly founded a prediction may seem to us to be, we are never absolutely sure that experiment will not prove it false. But the probability is often so great that practically we may be satisfied with it." And a little further on I have added: "See what a rôle the belief in simplicity plays in our generalizations. We have verified a simple law in a great number of particular cases; we refuse to admit that this coincidence, so often repeated, can be a mere effect of chance...."

Thus in a multitude of circumstances the physicist is in the same position as the gambler who reckons up his chances. As often as he reasons by induction, he requires more or less consciously the calculus of probabilities, and this is why I am obliged to introduce a parenthesis, and interrupt our study of method in the physical sciences in order to examine a little more closely the value of this calculus, and what confidence it merits.

The very name calculus of probabilities is a paradox. Probability opposed to certainty is what we do not know, and how can we calculate what we do not know? Yet many eminent savants have occupied themselves with this calculus, and it can not be denied that science has drawn therefrom no small advantage.

How can we explain this apparent contradiction?

Has probability been defined? Can it even be defined? And if it can not, how dare we reason about it? The definition, it will be said, is very simple: the probability of an event is the ratio of the number of cases favorable to this event to the total number of possible cases.

A simple example will show how incomplete this definition is. I throw two dice. What is the probability that one of the two at least turns up a six? Each die can turn up in six different ways; the number of possible cases is 6 × 6 = 36; the number of favorable cases is 11; the probability is 11/36.

That is the correct solution. But could I not just as well say: The points which turn up on the two dice can form 6 × 7/2 = 21 different combinations? Among these combinations 6 are favorable; the probability is 6/21.

Now why is the first method of enumerating the possible cases more legitimate than the second? In any case it is not our definition that tells us.

We are therefore obliged to complete this definition by saying: '... to the total number of possible cases provided these cases are equally probable.' So, therefore, we are reduced to defining the probable by the probable.
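The two enumerations of the dice problem can be checked by brute force. The following Python sketch is an illustration only; its variable names are assumptions made for this example. It counts the 36 ordered throws, groups them into the 21 unordered combinations, and shows that those combinations do not all carry the same weight, which is precisely what the uncompleted definition fails to tell us.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of two fair dice (assumed equally probable).
ordered = list(product(range(1, 7), repeat=2))

# First enumeration: ordered outcomes containing at least one six.
p_ordered = Fraction(sum(1 for t in ordered if 6 in t), len(ordered))

# Second enumeration: unordered combinations, each counted once.
weights = Counter(tuple(sorted(t)) for t in ordered)
p_unordered = Fraction(sum(1 for c in weights if 6 in c), len(weights))

# The 21 combinations do not all have the same weight:
# a double such as (6, 6) arises one way, a mixed pair such as (1, 6) two ways.
print(p_ordered)      # 11/36
print(p_unordered)    # 2/7, i.e. 6/21 in lowest terms
print(weights[(6, 6)], weights[(1, 6)])  # 1 2
```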

How can we know that two possible cases are equally probable? Will it be by a convention? If we place at the beginning of each problem an explicit convention, well and good. We shall then have nothing to do but apply the rules of arithmetic and of algebra, and we shall complete our calculation without our result leaving room for doubt. But if we wish to make the slightest application of this result, we must prove our convention was legitimate, and we shall find ourselves in the presence of the very difficulty we thought to escape.

Will it be said that good sense suffices to show us what convention should be adopted? Alas! M. Bertrand has amused himself by discussing the following simple problem: "What is the probability that a chord of a circle may be greater than the side of the inscribed equilateral triangle?" The illustrious geometer successively adopted two conventions which good sense seemed equally to dictate and with one he found 1/2, with the other 1/3.
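The text does not say which two conventions M. Bertrand adopted. The Python sketch below assumes two that are commonly used to illustrate the problem: in the first the distance of the chord from the center is taken uniformly along a radius, in the second the two ends of the chord are taken uniformly on the circumference. The function names, the number of trials and the seed are likewise assumptions of this example.

```python
import math
import random

def chord_from_radial_distance(rng):
    # Convention A (assumed): the chord's distance from the center
    # is uniform on [0, 1] for a circle of radius 1.
    d = rng.random()
    return 2.0 * math.sqrt(1.0 - d * d)

def chord_from_endpoints(rng):
    # Convention B (assumed): both endpoints are uniform on the circumference.
    a = rng.uniform(0.0, 2.0 * math.pi)
    b = rng.uniform(0.0, 2.0 * math.pi)
    return 2.0 * abs(math.sin((a - b) / 2.0))

def estimate(draw, trials=200_000, seed=0):
    # Fraction of random chords longer than the side of the inscribed triangle.
    rng = random.Random(seed)
    side = math.sqrt(3.0)  # side of the equilateral triangle inscribed in a unit circle
    hits = sum(1 for _ in range(trials) if draw(rng) > side)
    return hits / trials

p_a = estimate(chord_from_radial_distance)
p_b = estimate(chord_from_endpoints)
print(round(p_a, 2), round(p_b, 2))
```

Under these two conventions the estimates should fall near 1/2 and 1/3 respectively; nothing in the statement of the problem chooses between them.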

The conclusion which seems to follow from all this is that the calculus of probabilities is a useless science, and that the obscure instinct which we may call good sense, and to which we are wont to appeal to legitimatize our conventions, must be distrusted.

But neither can we subscribe to this conclusion; we can not do without this obscure instinct. Without it science would be impossible, without it we could neither discover a law nor apply it. Have we the right, for instance, to enunciate Newton's law? Without doubt, numerous observations are in accord with it; but is not this a simple effect of chance? Besides how do we know whether this law, true for so many centuries, will still be true next year? To this objection, you will find nothing to reply, except: 'That is very improbable.'

But grant the law. Thanks to it, I believe myself able to calculate the position of Jupiter a year from now. Have I the right to believe this? Who can tell if a gigantic mass of enormous velocity will not between now and that time pass near the solar system, and produce unforeseen perturbations? Here again the only answer is: 'It is very improbable.'

From this point of view, all the sciences would be only unconscious applications of the calculus of probabilities. To condemn this calculus would be to condemn the whole of science.

I shall dwell lightly on the scientific problems in which the intervention of the calculus of probabilities is more evident. In the forefront of these is the problem of interpolation, in which, knowing a certain number of values of a function, we seek to divine the intermediate values.

I shall likewise mention: the celebrated theory of errors of observation, to which I shall return later; the kinetic theory of gases, a well-known hypothesis, wherein each gaseous molecule is supposed to describe an extremely complicated trajectory, but in which, through the effect of great numbers, the mean phenomena, alone observable, obey the simple laws of Mariotte and Gay-Lussac.

All these theories are based on the laws of great numbers, and the calculus of probabilities would evidently involve them in its ruin. It is true that they have only a particular interest and that, save as far as interpolation is concerned, these are sacrifices to which we might readily be resigned.

But, as I have said above, it would not be only these partial sacrifices that would be in question; it would be the legitimacy of the whole of science that would be challenged.

I quite see that it might be said: "We are ignorant, and yet we must act. For action, we have not time to devote ourselves to an inquiry sufficient to dispel our ignorance. Besides, such an inquiry would demand an infinite time. We must therefore decide without knowing; we are obliged to do so, hit or miss, and we must follow rules without quite believing them. What I know is not that such and such a thing is true, but that the best course for me is to act as if it were true." The calculus of probabilities, and consequently science itself, would thenceforth have merely a practical value.

Unfortunately the difficulty does not thus disappear. A gambler wants to try a _coup_; he asks my advice. If I give it to him, I shall use the calculus of probabilities, but I shall not guarantee success. This is what I shall call _subjective probability_. In this case, we might be content with the explanation of which I have just given a sketch. But suppose that an observer is present at the game, that he notes all its _coups_, and that the game goes on a long time. When he makes a summary of his book, he will find that events have taken place in conformity with the laws of the calculus of probabilities. This is what I shall call _objective probability_, and it is this phenomenon which has to be explained.

There are numerous insurance companies which apply the rules of the calculus of probabilities, and they distribute to their shareholders dividends whose objective reality can not be contested. To invoke our ignorance and the necessity to act does not suffice to explain them.

Thus absolute skepticism is not admissible. We may distrust, but we can not condemn _en bloc_. Discussion is necessary.

I. CLASSIFICATION OF THE PROBLEMS OF PROBABILITY.--In order to classify the problems which present themselves _à propos_ of probabilities, we may look at them from many different points of view, and, first, from the _point of view of generality_. I have said above that probability is the ratio of the number of favorable cases to the number of possible cases. What for want of a better term I call the generality will increase with the number of possible cases. This number may be finite, as, for instance, if we take a throw of the dice in which the number of possible cases is 36. That is the first degree of generality.

But if we ask, for example, what is the probability that a point within a circle is within the inscribed square, there are as many possible cases as there are points in the circle, that is to say, an infinity. This is the second degree of generality. Generality can be pushed further still. We may ask the probability that a function will satisfy a given condition. There are then as many possible cases as one can imagine different functions. This is the third degree of generality, to which we rise, for instance, when we seek to find the most probable law in conformity with a finite number of observations.

We may place ourselves at a point of view wholly different. If we were not ignorant, there would be no probability, there would be room for nothing but certainty. But our ignorance can not be absolute, for then there would no longer be any probability at all, since a little light is necessary to attain even this uncertain science. Thus the problems of probability may be classed according to the greater or less depth of this ignorance.

In mathematics even we may set ourselves problems of probability. What is the probability that the fifth decimal of a logarithm taken at random from a table is a '9'? There is no hesitation in answering that this probability is 1/10; here we possess all the data of the problem. We can calculate our logarithm without recourse to the table, but we do not wish to give ourselves the trouble. This is the first degree of ignorance.

In the physical sciences our ignorance becomes greater. The state of a system at a given instant depends on two things: Its initial state, and the law according to which that state varies. If we know both this law and this initial state, we shall have then only a mathematical problem to solve, and we fall back upon the first degree of ignorance.

But it often happens that we know the law, and do not know the initial state. It may be asked, for instance, what is the present distribution of the minor planets? We know that from all time they have obeyed the laws of Kepler, but we do not know what was their initial distribution.

In the kinetic theory of gases, we assume that the gaseous molecules follow rectilinear trajectories, and obey the laws of impact of elastic bodies. But, as we know nothing of their initial velocities, we know nothing of their present velocities.

The calculus of probabilities only enables us to predict the mean phenomena which will result from the combination of these velocities. This is the second degree of ignorance.

Finally it is possible that not only the initial conditions but the laws themselves are unknown. We then reach the third degree of ignorance and in general we can no longer affirm anything at all as to the probability of a phenomenon.

It often happens that, instead of trying to guess an event by means of a more or less imperfect knowledge of the law, we know the events and want to find the law; or that, instead of deducing effects from causes, we wish to deduce the causes from the effects. These are the problems called _probability of causes_, the most interesting from the point of view of their scientific applications.

I play écarté with a gentleman I know to be perfectly honest. He is about to deal. What is the probability of his turning up the king? It is 1/8. This is a problem of the probability of effects.

I play with a gentleman whom I do not know. He has dealt ten times, and he has turned up the king six times. What is the probability that he is a sharper? This is a problem in the probability of causes.

It may be said that this is the essential problem of the experimental method. I have observed _n_ values of _x_ and the corresponding values of _y_. I have found that the ratio of the latter to the former is practically constant. There is the event, what is the cause?

Is it probable that there is a general law according to which _y_ would be proportional to _x_, and that the small divergencies are due to errors of observation? This is a type of question that one is ever asking, and which we unconsciously solve whenever we are engaged in scientific work.

I am now going to pass in review these different categories of problems, discussing in succession what I have called above subjective and objective probability.

II. PROBABILITY IN MATHEMATICS.--The impossibility of squaring the circle has been proved since 1882; but even before that date all geometers considered that impossibility as so 'probable' that the Academy of Sciences rejected without examination the, alas, too numerous memoirs on this subject that some unhappy madmen sent in every year.

Was the Academy wrong? Evidently not, and it knew well that in acting thus it did not run the least risk of stifling a discovery of moment. The Academy could not have proved that it was right; but it knew quite well that its instinct was not mistaken. If you had asked the Academicians, they would have answered: "We have compared the probability that an unknown savant should have found out what has been vainly sought for so long, with the probability that there is one madman the more on the earth; the second appears to us the greater." These are very good reasons, but there is nothing mathematical about them; they are purely psychological.

And if you had pressed them further they would have added: "Why do you suppose a particular value of a transcendental function to be an algebraic number; and if [pi] were a root of an algebraic equation, why do you suppose this root to be a period of the function sin 2_x_, and not suppose the same of the other roots of this same equation?" To sum up, they would have invoked the principle of sufficient reason in its vaguest form.

But what could they deduce from it? At most a rule of conduct for the employment of their time, more usefully spent at their ordinary work than in reading a lucubration that inspired in them a legitimate distrust. But what I have called above objective probability has nothing in common with this first problem.

It is otherwise with the second problem.

Consider the first 10,000 logarithms that we find in a table. Among these 10,000 logarithms I take one at random. What is the probability that its third decimal is an even number? You will not hesitate to answer 1/2; and in fact if you pick out in a table the third decimals of these 10,000 numbers, you will find nearly as many even digits as odd.

Or if you prefer, let us write 10,000 numbers corresponding to our 10,000 logarithms, each of these numbers being +1 if the third decimal of the corresponding logarithm is even, and -1 if odd. Then take the mean of these 10,000 numbers.

I do not hesitate to say that the mean of these 10,000 numbers is probably 0, and if I were actually to calculate it I should verify that it is extremely small.

But even this verification is needless. I might have rigorously proved that this mean is less than 0.003. To prove this result, I should have had to make a rather long calculation for which there is no room here, and for which I confine myself to citing an article I published in the _Revue générale des Sciences_, April 15, 1899. The only point to which I wish to call attention is the following: in this calculation, I should have needed only to rest my case on two facts, to wit, that the first and second derivatives of the logarithm remain, in the interval considered, between certain limits.

Hence this important consequence that the property is true not only of the logarithm, but of any continuous function whatever, since the derivatives of every continuous function are limited.

If I was certain beforehand of the result, it is first, because I had often observed analogous facts for other continuous functions; and next, because I made in my mind, in a more or less unconscious and imperfect manner, the reasoning which led me to the preceding inequalities, just as a skilled calculator before finishing his multiplication takes into account what it should come to approximately.

And besides, since what I call my intuition was only an incomplete summary of a piece of true reasoning, it is clear why observation has confirmed my predictions, and why the objective probability has been in agreement with the subjective probability.
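The verification spoken of above is easy to carry out by machine. The Python sketch below assumes that 'the first 10,000 logarithms' are the common logarithms of the integers 1 to 10,000, and the function name is hypothetical. It makes no claim to reproduce the bound 0.003, which rests on a calculation not given here; it only computes the mean of the +1 and -1 values so that its smallness may be observed.

```python
import math

def parity_sign(n):
    # +1 if the third decimal of log10(n) is even, -1 if it is odd.
    third_decimal = int(math.log10(n) * 1000) % 10
    return 1 if third_decimal % 2 == 0 else -1

# Assumption: "the first 10,000 logarithms" means log10(1) ... log10(10000).
signs = [parity_sign(n) for n in range(1, 10_001)]
mean = sum(signs) / len(signs)
print(mean)
```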

As a third example I shall choose the following problem: A number _u_ is taken at random, and _n_ is a given very large integer. What is the probable value of sin _nu_? This problem has no meaning by itself. To give it one, a convention is needed. We _shall agree_ that the probability for the number _u_ to lie between _a_ and _a_ + _da_ is equal to [phi](_a_)_da_; that it is therefore proportional to the infinitely small interval _da_, and equal to this multiplied by a function [phi](_a_) depending only on _a_. As for this function, I choose it arbitrarily, but I must assume it to be continuous. The value of sin _nu_ remaining the same when _u_ increases by 2[pi], I may without loss of generality assume that _u_ lies between 0 and 2[pi], and I shall thus be led to suppose that [phi](_a_) is a periodic function whose period is 2[pi].

The probable value sought is readily expressed by a simple integral, and it is easy to show that this integral is less than

2[pi]M_{_k_}/_n_^{_k_},

M_{_k_} being the maximum value of the _k_th derivative of [phi](_u_). We see then that if the _k_th derivative is finite, our probable value will tend toward 0 when _n_ increases indefinitely, and that more rapidly than 1/_n_^{_k_ - 1}.

The probable value of sin _nu_ when _n_ is very large is therefore naught. To define this value I required a convention; but the result remains the same _whatever that convention may be_. I have imposed upon myself only slight restrictions in assuming that the function [phi](_a_) is continuous and periodic, and these hypotheses are so natural that we may ask ourselves how they can be escaped.
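A numerical sketch may make this independence of the convention more tangible. In the Python below, the density `phi` is a hypothetical choice (continuous, periodic, and far from uniform), and the rectangle rule with 20,000 points is merely one convenient way of approximating the integral; neither comes from the text.

```python
import math

def phi(u):
    # Hypothetical density on [0, 2*pi): continuous, periodic, not uniform.
    # It is left unnormalized here; expected_sin divides by its total mass.
    return math.exp(math.sin(u))

def expected_sin(n, samples=20_000):
    # Rectangle-rule approximation of the integral of sin(n*u) * phi(u) du,
    # divided by the integral of phi(u) du.
    h = 2.0 * math.pi / samples
    num = sum(math.sin(n * i * h) * phi(i * h) for i in range(samples))
    den = sum(phi(i * h) for i in range(samples))
    return num / den

for n in (1, 3, 5, 51):
    print(n, expected_sin(n))
```

Since every derivative of this particular `phi` is bounded, the probable value should fall off very quickly as _n_ grows; a less regular density would give a slower decrease, but a decrease all the same.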

Examination of the three preceding examples, so different in all respects, has already given us a glimpse, on the one hand, of the rôle of what philosophers call the principle of sufficient reason, and, on the other hand, of the importance of the fact that certain properties are common to all continuous functions. The study of probability in the physical sciences will lead us to the same result.

III. PROBABILITY IN THE PHYSICAL SCIENCES.--We come now to the problems connected with what I have called the second degree of ignorance, those, namely, in which we know the law, but do not know the initial state of the system. I could multiply examples, but will take only one. What is the probable present distribution of the minor planets on the zodiac?

We know they obey the laws of Kepler. We may even, without at all changing the nature of the problem, suppose that their orbits are all circular, and situated in the same plane, and that we know this plane. On the other hand, we are in absolute ignorance as to what was their initial distribution. However, we do not hesitate to affirm that their distribution is now nearly uniform. Why?

Let _b_ be the longitude of a minor planet in the initial epoch, that is to say, the epoch zero. Let _a_ be its mean motion. Its longitude at the present epoch, that is to say at the epoch _t_, will be _at_ + _b_. To say that the present distribution is uniform is to say that the mean value of the sines and cosines of multiples of _at_ + _b_ is zero. Why do we assert this?

Let us represent each minor planet by a point in a plane, to wit, by a point whose coordinates are precisely _a_ and _b_. All these representative points will be contained in a certain region of the plane, but as they are very numerous this region will appear dotted with points. We know nothing else about the distribution of these points.

What do we do when we wish to apply the calculus of probabilities to such a question? What is the probability that one or more representative points may be found in a certain portion of the plane? In our ignorance, we are reduced to making an arbitrary hypothesis. To explain the nature of this hypothesis, allow me to use, in lieu of a mathematical formula, a crude but concrete image. Let us suppose that over the surface of our plane has been spread an imaginary substance, whose density is variable, but varies continuously. We shall then agree to say that the probable number of representative points to be found on a portion of the plane is proportional to the quantity of fictitious matter found there. If we have then two regions of the plane of the same extent, the probabilities that a representative point of one of our minor planets is found in one or the other of these regions will be to one another as the mean densities of the fictitious matter in the one and the other region.

Here then are two distributions, one real, in which the representative points are very numerous, very close together, but discrete like the molecules of matter in the atomic hypothesis; the other remote from reality, in which our representative points are replaced by continuous fictitious matter. We know that the latter can not be real, but our ignorance forces us to adopt it.

If again we had some idea of the real distribution of the representative points, we could arrange it so that in a region of some extent the density of this imaginary continuous matter would be nearly proportional to the number of the representative points, or, if you wish, to the number of atoms which are contained in that region. Even that is impossible, and our ignorance is so great that we are forced to choose arbitrarily the function which defines the density of our imaginary matter. Only we shall be forced to a hypothesis from which we can hardly get away, we shall suppose that this function is continuous. That is sufficient, as we shall see, to enable us to reach a conclusion.

What is at the instant _t_ the probable distribution of the minor planets? Or rather what is the probable value of the sine of the longitude at the instant _t_, that is to say of sin (_at_ + _b_)? We made at the outset an arbitrary convention, but if we adopt it, this probable value is entirely defined. Divide the plane into elements of surface. Consider the value of sin (_at_ + _b_) at the center of each of these elements; multiply this value by the surface of the element, and by the corresponding density of the imaginary matter. Take then the sum for all the elements of the plane. This sum, by definition, will be the probable mean value we seek, which will thus be expressed by a double integral. It may be thought at first that this mean value depends on the choice of the function which defines the density of the imaginary matter, and that, as this function [phi] is arbitrary, we can, according to the arbitrary choice which we make, obtain any mean value. This is not so.

A simple calculation shows that our double integral decreases very rapidly when _t_ increases. Thus I could not quite tell what hypothesis to make as to the probability of this or that initial distribution; but whatever the hypothesis made, the result will be the same, and this gets me out of my difficulty.

Whatever be the function [phi], the mean value tends toward zero as _t_ increases, and as the minor planets have certainly accomplished a very great number of revolutions, I may assert that this mean value is very small.

I may choose [phi] as I wish, save always one restriction: this function must be continuous; and, in fact, from the point of view of subjective probability, the choice of a discontinuous function would have been unreasonable. For instance, what reason could I have for supposing that the initial longitude might be exactly 0°, but that it could not lie between 0° and 1°?

But the difficulty reappears if we take the point of view of objective probability, if we pass from our imaginary distribution in which the fictitious matter was supposed continuous to the real distribution in which our representative points form, as it were, discrete atoms.

The mean value of sin (_at_ + _b_) will be represented quite simply by

(1/_n_){[Sigma] sin (_at_ + _b_)},

_n_ being the number of minor planets. In lieu of a double integral referring to a continuous function, we shall have a sum of discrete terms. And yet no one will seriously doubt that this mean value is practically very small.

Our representative points being very close together, our discrete sum will in general differ very little from an integral.

An integral is the limit toward which a sum of terms tends when the number of these terms is indefinitely increased. If the terms are very numerous, the sum will differ very little from its limit, that is to say from the integral, and what I said of this latter will still be true of the sum itself.
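The behavior of the discrete sum can be observed on an invented population. In the Python sketch below, the number of planets, the range of the mean motions and the crowded initial longitudes are all assumptions, chosen so that the initial distribution is deliberately far from uniform.

```python
import math
import random

rng = random.Random(1)
n = 5000  # number of hypothetical minor planets

# Assumed initial data, chosen far from uniform on purpose:
# mean motions a in [1, 2] and initial longitudes b crowded into [0, 0.5].
planets = [(rng.uniform(1.0, 2.0), rng.uniform(0.0, 0.5)) for _ in range(n)]

def mean_sine(t):
    # Discrete mean (1/n) * sum of sin(a*t + b) over all planets.
    return sum(math.sin(a * t + b) for a, b in planets) / n

for t in (0.0, 10.0, 1000.0):
    print(t, mean_sine(t))
```

At the epoch zero the mean is far from zero, because the longitudes are crowded together; after many revolutions it should be very small, although nothing was assumed about the initial distribution beyond its being drawn from a continuous law.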

Nevertheless, there are exceptions. If, for instance, for all the minor planets,

_b_ = [pi]/2 - _at_,

the longitude for all the planets at the time _t_ would be [pi]/2, and the mean value would evidently be equal to unity. For this to be the case, it would be necessary that at the epoch 0 the minor planets were all lying on a spiral of peculiar form, with its spires very close together. Every one will admit that such an initial distribution is extremely improbable (and, even supposing it realized, the distribution would not be uniform at the present time, for example, on January 1, 1913, but it would become so a few years later).

Why then do we think this initial distribution improbable? This must be explained, because if we had no reason for rejecting as improbable this absurd hypothesis everything would break down, and we could no longer make any affirmation about the probability of this or that present distribution.

Once more we shall invoke the principle of sufficient reason to which we must always recur. We might admit that at the beginning the planets were distributed almost in a straight line. We might admit that they were irregularly distributed. But it seems to us that there is no sufficient reason for the unknown cause that gave them birth to have acted along a curve so regular and yet so complicated, which would appear to have been expressly chosen so that the present distribution would not be uniform.
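The exceptional spiral distribution can also be put to the test. In the Python sketch below, the epoch `t0`, the range of mean motions and the number of planets are assumptions made for the example. The initial longitudes are arranged so that every planet has the longitude [pi]/2 at the epoch `t0`, and the mean sine is then computed at that epoch and somewhat later.

```python
import math
import random

rng = random.Random(2)
n = 5000
t0 = 1000.0  # the epoch at which the arranged coincidence is to appear

# Assumed mean motions, spread continuously over [1, 2].
motions = [rng.uniform(1.0, 2.0) for _ in range(n)]
# Exceptional initial longitudes: b = pi/2 - a*t0 for every planet.
planets = [(a, math.pi / 2.0 - a * t0) for a in motions]

def mean_sine(t):
    # Discrete mean (1/n) * sum of sin(a*t + b) over all planets.
    return sum(math.sin(a * t + b) for a, b in planets) / n

print(mean_sine(t0))          # every term is sin(pi/2)
print(mean_sine(t0 + 200.0))  # some time later
```

The mean is equal to unity at the chosen epoch, up to rounding, and should become small again shortly afterward, so that the conspiracy must be aimed at one particular date.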

IV. ROUGE ET NOIR.--The questions raised by games of chance, such as roulette, are, fundamentally, entirely analogous to those we have just treated. For example, a wheel is partitioned into a great number of equal subdivisions, alternately red and black. A needle is whirled with force, and after having made a great number of revolutions, it stops before one of these subdivisions. The probability that this division is red is evidently 1/2. The needle describes an angle [theta], including several complete revolutions. I do not know what is the probability that the needle may be whirled with a force such that this angle should lie between [theta] and [theta]+_d_[theta]; but I can make a convention. I can suppose that this probability is [phi]([theta])_d_[theta]. As for the function [phi]([theta]), I can choose it in an entirely arbitrary manner. There is nothing that can guide me in my choice, but I am naturally led to suppose this function continuous.

Let [epsilon] be the length (measured on the circumference of radius 1) of each red and black subdivision. We have to calculate the integral of [phi]([theta])_d_[theta], extending it, on the one hand, to all the red divisions and, on the other hand, to all the black divisions, and to compare the results.

Consider an interval 2[epsilon], comprising a red division and a black division which follows it. Let M and _m_ be the greatest and least values of the function [phi]([theta]) in this interval. The integral extended to the red divisions will be smaller than [Sigma]M[epsilon]; the integral extended to the black divisions will be greater than [Sigma]_m_[epsilon]; the difference will therefore be less than [Sigma](M - _m_)[epsilon]. But, if the function [phi] is supposed continuous; if, besides, the interval [epsilon] is very small with respect to the total angle described by the needle, the difference M - _m_ will be very small. The difference of the two integrals will therefore be very small, and the probability will be very nearly 1/2.

We see that without knowing anything of the function [phi], I must act as if the probability were 1/2. We understand, on the other hand, why, if, placing myself at the objective point of view, I observe a certain number of coups, observation will give me about as many black coups as red.
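The comparison of the two integrals may be sketched numerically. In the Python below, the law of the angle [theta] is a hypothetical continuous density whose integral is known in closed form; the scale of 20 radians and the two sector widths are assumptions chosen only to contrast narrow divisions with divisions as wide as the spread of the angle.

```python
import math

def cdf(theta, scale=20.0):
    # Hypothetical law for the total angle theta >= 0 (in radians):
    # density phi(theta) = theta * exp(-theta/scale) / scale**2, continuous.
    x = theta / scale
    return 1.0 - (1.0 + x) * math.exp(-x)

def prob_red(eps, scale=20.0):
    # Sectors of angular width eps alternate red, black, red, ...
    # Red sectors are [0, eps), [2*eps, 3*eps), and so on.
    total, k = 0.0, 0
    while k * eps < 60.0 * scale:  # beyond this the remaining mass is negligible
        total += cdf((k + 1) * eps, scale) - cdf(k * eps, scale)
        k += 2
    return total

print(prob_red(0.1))   # narrow sectors
print(prob_red(40.0))  # sectors as wide as the spread of theta
```

With narrow divisions the probability of red should be very near 1/2; with divisions comparable to the spread of the angle it departs from 1/2, which shows that the smallness of [epsilon] is doing the work and not any particular choice of [phi].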

All players know this objective law; but it leads them into a remarkable error, which has been often exposed, but into which they always fall again. When the red has won, for instance, six times running, they bet on the black, thinking they are playing a safe game; because, say they, it is very rare that red wins seven times running.

In reality their probability of winning remains 1/2. Observation shows, it is true, that series of seven consecutive reds are very rare, but series of six reds followed by a black are just as rare.

They have noticed the rarity of the series of seven reds; if they have not remarked the rarity of six reds and a black, it is only because such series strike the attention less.
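The equality of the two rarities is a matter of simple counting. The Python sketch below assumes that red and black each have probability 1/2 and that successive coups are independent; it enumerates all series of seven coups.

```python
from fractions import Fraction
from itertools import product

# All 2**7 equally probable colour sequences of seven coups
# (assuming red and black each have probability 1/2 and coups are independent).
sequences = list(product("RB", repeat=7))

p_seven_reds = Fraction(sequences.count(tuple("RRRRRRR")), len(sequences))
p_six_reds_then_black = Fraction(sequences.count(tuple("RRRRRRB")), len(sequences))

# Conditional probability of black on the seventh coup after six reds.
after_six_reds = [s for s in sequences if s[:6] == tuple("RRRRRR")]
p_black_next = Fraction(sum(1 for s in after_six_reds if s[6] == "B"), len(after_six_reds))

print(p_seven_reds, p_six_reds_then_black, p_black_next)  # 1/128 1/128 1/2
```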

V. THE PROBABILITY OF CAUSES.--We now come to the problems of the probability of causes, the most important from the point of view of scientific applications. Two stars, for instance, are very close together on the celestial sphere. Is this apparent contiguity a mere effect of chance? Are these stars, although on almost the same visual ray, situated at very different distances from the earth, and consequently very far from one another? Or, perhaps, does the apparent correspond to a real contiguity? This is a problem on the probability of causes.

I recall first that at the outset of all problems of the probability of effects that have hitherto occupied us, we have always had to make a convention, more or less justified. And if in most cases the result was, in a certain measure, independent of this convention, this was only because of certain hypotheses which permitted us to reject _a priori_ discontinuous functions, for example, or certain absurd conventions.

We shall find something analogous when we deal with the probability of causes. An effect may be produced by the cause _A_ or by the cause _B_. The effect has just been observed. We ask the probability that it is due to the cause _A_. This is an _a posteriori_ probability of cause. But I could not calculate it, if a convention more or less justified did not tell me _in advance_ what is the _a priori_ probability for the cause _A_ to come into play; I mean the probability of this event for some one who had not observed the effect.

The better to explain myself I go back to the example of the game of écarté mentioned above. My adversary deals for the first time and he turns up a king. What is the probability that he is a sharper? The formulas ordinarily taught give 8/9, a result evidently rather surprising. If we look at it closer, we see that the calculation is made as if, _before sitting down at the table_, I had considered that there was one chance in two that my adversary was not honest. An absurd hypothesis, because in that case I should have certainly not played with him, and this explains the absurdity of the conclusion.

The convention about the _a priori_ probability was unjustified, and that is why the calculation of the _a posteriori_ probability led me to an inadmissible result. We see the importance of this preliminary convention. I shall even add that if none were made, the problem of the _a posteriori_ probability would have no meaning. It must always be made either explicitly or tacitly.
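
The arithmetic behind the 8/9 may be checked with a short sketch. It assumes that an honest dealer turns up the king with probability 1/8 and, as the ordinary formula tacitly supposes, that a sharper turns it up with certainty. The function name and the alternative prior of one chance in a thousand are illustrative choices only, not part of the text.

```python
from fractions import Fraction

def posterior_sharper(prior_sharper,
                      p_king_if_honest=Fraction(1, 8),   # honest dealer turns the king
                      p_king_if_sharper=Fraction(1)):    # assumption: a sharper always does
    """A posteriori probability that the dealer is a sharper, given that he turned up a king."""
    prior_honest = 1 - prior_sharper
    # Total probability of seeing the king under the chosen a priori convention.
    evidence = prior_sharper * p_king_if_sharper + prior_honest * p_king_if_honest
    return prior_sharper * p_king_if_sharper / evidence

# The convention hidden in the usual formula: one chance in two of dishonesty.
even_prior = posterior_sharper(Fraction(1, 2))
# A hypothetical, more reasonable convention: one chance in a thousand.
cautious_prior = posterior_sharper(Fraction(1, 1000))

print(even_prior)      # 8/9
print(cautious_prior)  # 8/1007
```

The same observation thus yields 8/9 or less than one in a hundred, according to the convention adopted before the deal; the formula itself has not changed.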

Pass to an example of a more scientific character. I wish to determine an experimental law. This law, when I know it, can be represented by a curve. I make a certain number of isolated observations; each of these will be represented by a point. When I have obtained these different points, I draw a curve between them, striving to pass as near to them as possible and yet preserve for my curve a regular form, without angular points, or inflections too accentuated, or brusque variation of the radius of curvature. This curve will represent for me the probable law, and I assume not only that it will tell me the values of the function intermediate between those which have been observed, but also that it will give me the observed values themselves more exactly than direct observation. This is why I make it pass near the points, and not through the points themselves.

Here is a problem in the probability of causes. The effects are the measurements I have recorded; they depend on a combination of two causes: the true law of the phenomenon and the errors of observation. Knowing the effects, we have to seek the probability that the phenomenon obeys this law or that, and that the observations have been affected by this or that error. The most probable law then corresponds to the curve traced, and the most probable error of an observation is represented by the distance of the corresponding point from this curve.

But the problem would have no meaning if, before any observation, I had not fashioned an _a priori_ idea of the probability of this or that law, and of the chances of error to which I am exposed.

If my instruments are good (and that I knew before making the observations), I shall not permit my curve to depart much from the points which represent the rough measurements. If they are bad, I may go a little further away from them in order to obtain a less sinuous curve; I shall sacrifice more to regularity.

Why then is it that I seek to trace a curve without sinuosities? It is because I consider _a priori_ a law represented by a continuous function (or by a function whose derivatives of high order are small), as more probable than a law not satisfying these conditions. Without this belief, the problem of which we speak would have no meaning; interpolation would be impossible; no law could be deduced from a finite number of observations; science would not exist.
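
The compromise between fidelity to the points and regularity of the curve can be sketched numerically. What follows is only one possible formalization, not the procedure of the text: the belief in continuity is represented by a penalty on the second differences of the curve, and the weight `lam` stands for the _a priori_ distrust of the instruments. The observed values, the function names and the two weights are hypothetical.

```python
def smooth(y, lam):
    """Minimise sum (y[i]-z[i])**2 + lam * sum (z[i-1] - 2*z[i] + z[i+1])**2 over z."""
    n = len(y)
    # Normal equations: (I + lam * D^T D) z = y, with D the second-difference operator.
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n - 2):
        idx, coef = (k, k + 1, k + 2), (1.0, -2.0, 1.0)
        for i, ci in zip(idx, coef):
            for j, cj in zip(idx, coef):
                a[i][j] += lam * ci * cj
    # Gaussian elimination without pivoting (safe here: the matrix is
    # symmetric positive definite for lam >= 0).
    z = [float(v) for v in y]
    for col in range(n):
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            if f != 0.0:
                for j in range(col, n):
                    a[row][j] -= f * a[col][j]
                z[row] -= f * z[col]
    # Back-substitution.
    for row in range(n - 1, -1, -1):
        s = sum(a[row][j] * z[j] for j in range(row + 1, n))
        z[row] = (z[row] - s) / a[row][row]
    return z

def roughness(z):
    """Sum of squared second differences: large for a sinuous curve."""
    return sum((z[i - 1] - 2 * z[i] + z[i + 1]) ** 2 for i in range(1, len(z) - 1))

def misfit(y, z):
    """Sum of squared distances between the observed points and the curve."""
    return sum((p - q) ** 2 for p, q in zip(y, z))

observed = [0.0, 1.3, 1.8, 3.4, 3.7, 5.2, 6.1, 6.6, 8.3, 8.9]  # hypothetical measurements
good_instruments = smooth(observed, lam=0.1)    # stay close to the points
bad_instruments = smooth(observed, lam=100.0)   # sacrifice more to regularity

for name, curve in (("good", good_instruments), ("bad", bad_instruments)):
    print(name, round(misfit(observed, curve), 3), round(roughness(curve), 3))
```

With the small weight the curve hugs the measurements; with the large one it departs further from them and becomes nearly straight. Neither choice is dictated by the observations: the weight must be fixed beforehand.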

Fifty years ago physicists considered, other things being equal, a simple law as more probable than a complicated law. They even invoked this principle in favor of Mariotte's law as against the experiments of Regnault. To-day they have repudiated this belief; and yet, how many times are they compelled to act as though they still held it! However that may be, what remains of this tendency is the belief in continuity, and we have just seen that if this belief were to disappear in its turn, experimental science would become impossible.

VI. THE THEORY OF ERRORS.--We are thus led to speak of the theory of errors, which is directly connected with the problem of the probability of causes. Here again we find _effects_, to wit, a certain number of discordant observations, and we seek to divine the _causes_, which are, on the one hand, the real value of the quantity to be measured; on the other hand, the error made in each isolated observation. It is necessary to calculate what is _a posteriori_ the probable magnitude of each error, and consequently the probable value of the quantity to be measured.

But as I have just explained, we should not know how to undertake this calculation if we did not admit _a priori_, that is to say, before all observation, a law of probability of errors. Is there a law of errors?

The law of errors admitted by all calculators is Gauss's law, which is represented by a certain transcendental curve known under the name of 'the bell.'

But first it is proper to recall the classic distinction between systematic and accidental errors. If we measure a length with too long a meter, we shall always find too small a number, and it will be of no use to measure several times; this is a systematic error. If we measure with an accurate meter, we may, however, make a mistake; but we go wrong, now too much, now too little, and when we take the mean of a great number of measurements, the error will tend to grow small. These are accidental errors.

It is evident from the first that systematic errors can not satisfy Gauss's law; but do the accidental errors satisfy it? A great number of demonstrations have been attempted; almost all are crude paralogisms. Nevertheless, we may demonstrate Gauss's law by starting from the following hypotheses: the error committed is the result of a great number of partial and independent errors; each of the partial errors is very small and, besides, may obey any law of probability whatever, provided that the probability of a positive error is the same as that of an equal negative error. It is evident that these conditions will be often but not always fulfilled, and we may reserve the name of accidental for errors which satisfy them.
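
These hypotheses lend themselves to a small simulation. In the sketch below each accidental error is the sum of fifty small, independent partial errors, and the symmetric law chosen for them (uniform between two equal and opposite limits) is an arbitrary assumption, as are the sizes, the seed and the names. We then count how often the total error falls within one and two standard deviations of zero.

```python
import random
import statistics

def total_error(rng, parts=50, scale=0.01):
    """One accidental error: the sum of many small, independent, symmetric partial errors."""
    return sum(rng.uniform(-scale, scale) for _ in range(parts))

rng = random.Random(0)  # fixed seed so that the run is reproducible
errors = [total_error(rng) for _ in range(20000)]

mean_error = statistics.fmean(errors)
sigma = statistics.pstdev(errors)
within_one_sigma = sum(abs(e) <= sigma for e in errors) / len(errors)
within_two_sigma = sum(abs(e) <= 2 * sigma for e in errors) / len(errors)

print(round(within_one_sigma, 3), round(within_two_sigma, 3))
```

Gauss's law predicts fractions of about 0.683 and 0.954, and the simulated fractions should fall close to these values although no single partial error follows the bell curve. If the partial errors are made few, large or unsymmetrical, the agreement is no longer assured.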

We see that the method of least squares is not legitimate in every case; in general the physicists are more distrustful of it than the astronomers. This is, no doubt, because the latter, besides the systematic errors to which they and the physicists are subject alike, have to contend with an extremely important source of error which is wholly accidental; I mean atmospheric undulations. So it is very curious to hear a physicist discuss with an astronomer about a method of observation. The physicist, persuaded that one good measurement is worth more than many bad ones, is before all concerned with eliminating by dint of precautions the least systematic errors, and the astronomer says to him: 'But thus you can observe only a small number of stars; the accidental errors will not disappear.'

What should we conclude? Must we continue to use the method of least squares? We must distinguish. We have eliminated all the systematic errors we could suspect; we know well there are still others, but we can not detect them; yet it is necessary to make up our mind and adopt a definitive value which will be regarded as the probable value; and for that it is evident the best thing to do is to apply Gauss's method. We have only applied a practical rule referring to subjective probability. There is nothing more to be said.

But we wish to go farther and affirm that not only is the probable value so much, but that the probable error in the result is so much. _This is absolutely illegitimate_; it would be true only if we were sure that all the systematic errors were eliminated, and of that we know absolutely nothing. We have two series of observations; by applying the rule of least squares, we find that the probable error in the first series is half as large as in the second. The second series may, however, be better than the first, because the first perhaps is affected by a large systematic error. All we can say is that the first series is _probably_ better than the second, since its accidental error is smaller, and we have no reason to affirm that the systematic error is greater for one of the series than for the other, our ignorance on this point being absolute.
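
The case of the two series may be illustrated by a simulation in which, unlike the observer, we know the true value. Everything numerical here is hypothetical: the true value, the systematic error given to the first series, the two accidental scatters and the seed. The accidental errors are drawn from Gauss's law for convenience, and the standard error of the mean is used as a stand-in for the probable error, to which it is proportional under that law.

```python
import random
import statistics

def series(rng, true_value, systematic, accidental_sd, n=100):
    """n observations sharing one systematic error, each with its own accidental error."""
    return [true_value + systematic + rng.gauss(0.0, accidental_sd) for _ in range(n)]

def mean_and_standard_error(xs):
    """Least-squares value (the mean) and the scatter-based estimate of its uncertainty."""
    return statistics.fmean(xs), statistics.stdev(xs) / len(xs) ** 0.5

rng = random.Random(1)  # fixed seed so that the run is reproducible
TRUE_VALUE = 10.0       # known only to the simulation, never to the observer

first = series(rng, TRUE_VALUE, systematic=0.5, accidental_sd=0.1)
second = series(rng, TRUE_VALUE, systematic=0.0, accidental_sd=0.2)

mean1, se1 = mean_and_standard_error(first)
mean2, se2 = mean_and_standard_error(second)

print("first :", round(mean1, 3), "+/-", round(se1, 3))
print("second:", round(mean2, 3), "+/-", round(se2, 3))
```

The first series reports the smaller uncertainty, since its points are less scattered, and yet its mean lies much further from the true value. Nothing in the observations themselves betrays this; the scatter measures the accidental error alone.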

VII. CONCLUSIONS.--In the lines which precede, I have set many problems without solving any of them. Yet I do not regret having written them, because they will perhaps invite the reader to reflect on these delicate questions.

However that may be, there are certain points which seem well established. To undertake any calculation of probability, and even for that calculation to have any meaning, it is necessary to admit, as point of departure, a hypothesis or convention which has always something arbitrary about it. In the choice of this convention, we can be guided only by the principle of sufficient reason. Unfortunately this principle is very vague and very elastic, and in the cursory examination we have just made, we have seen it take many different forms. The form under which we have met it most often is the belief in continuity, a belief which it would be difficult to justify by apodeictic reasoning, but without which all science would be impossible. Finally the problems to which the calculus of probabilities may be applied with profit are those in which the result is independent of the hypothesis made at the outset, provided only that this hypothesis satisfies the condition of continuity.