
Statistics, in the modern sense of the word, began evolving in response to the novel needs of sovereign states. The evolution of statistics was, in particular, intimately connected with the development of nation states and with the development of probability theory, which put statistics on a firm theoretical basis.

In early times, the meaning was restricted to information about states, particularly demographics such as population. This was later extended to include all collections of information of all types, and later still it was extended to include the analysis and interpretation of such data. In modern terms, "statistics" means both sets of collected information, as in national accounts and temperature records, and analytical work which requires statistical inference. Statistical activities are often associated with models expressed using probabilities, hence the connection with probability theory. The heavy data-processing demands of statistical work have also made statistics a key application of computing. A number of statistical concepts have an important impact on a wide range of sciences. These include the design of experiments and approaches to statistical inference such as Bayesian inference, each of which can be considered to have its own sequence in the development of the ideas underlying modern statistics.

Origins in probability theory

Basic forms of statistics have been used since the beginning of civilization. Early empires often collated censuses of the population or recorded the trade in various commodities. The Han Dynasty and the Roman Empire were some of the first states to extensively gather data on the size of the empire's population, geographical area and wealth.

The use of statistical methods dates back to at least the 5th century BCE. The historian Thucydides in his History of the Peloponnesian War[1] describes how the Athenians calculated the height of the wall of Plataea by counting the number of bricks in an unplastered section of the wall sufficiently near them to be able to count them. The count was repeated several times by a number of soldiers. The most frequent value (in modern terminology, the mode) so determined was taken to be the most likely value of the number of bricks. Multiplying this value by the height of the bricks used in the wall allowed the Athenians to determine the height of the ladders necessary to scale the walls.[citation needed]

Forms of probability and statistics were developed by Al-Khalil (717–786 CE), an Arab mathematician studying cryptology. He wrote the Book of Cryptographic Messages which contains the first use of permutations and combinations to list all possible Arabic words with and without vowels.[2]

The earliest writing on statistics was found in a 9th-century Arabic book entitled Manuscript on Deciphering Cryptographic Messages, written by Al-Kindi (801–873). In his book, Al-Kindi gave a detailed description of how to use statistics and frequency analysis to decipher encrypted messages; the book covers methods of cryptanalysis, encipherments, cryptanalysis of certain encipherments, and statistical analysis of letters and letter combinations in Arabic.[5] This text arguably gave rise to the birth of both statistics and cryptanalysis.[3][4] Al-Kindi also made the earliest known use of statistical inference, while he and other Arab cryptologists developed the early statistical methods for decoding encrypted messages. An important contribution of Ibn Adlan (1187–1268) was on sample size for use of frequency analysis.[2]

In the early 11th century, Al-Biruni's scientific method emphasized repeated experimentation. Biruni was concerned with how to conceptualize and prevent both systematic errors and observational biases, such as "errors caused by the use of small instruments and errors made by human observers." He argued that if instruments produce errors because of their imperfections or idiosyncratic qualities, then multiple observations must be taken, analyzed qualitatively, and on this basis, arrive at a "common-sense single value for the constant sought", whether an arithmetic mean or a "reliable estimate."[6]

The Trial of the Pyx is a test of the purity of the coinage of the Royal Mint which has been held on a regular basis since the 12th century. The Trial itself is based on statistical sampling methods. After minting a series of coins - originally from ten pounds of silver - a single coin was placed in the Pyx - a box in Westminster Abbey. After a given period - now once a year - the coins are removed and weighed. A sample of the coins removed from the box is then tested for purity.

The Nuova Cronica, a 14th-century history of Florence by the Florentine banker and official Giovanni Villani, includes much statistical information on population, ordinances, commerce and trade, education, and religious facilities and has been described as the first introduction of statistics as a positive element in history,[7] though neither the term nor the concept of statistics as a specific field yet existed. This claim of priority was, however, undermined by the rediscovery of Al-Kindi's earlier work on frequency analysis.[3][4]

The arithmetic mean, although a concept known to the Greeks, was not generalised to more than two values until the 16th century. The invention of the decimal system by Simon Stevin in 1585 seems likely to have facilitated these calculations. This method was first adopted in astronomy by Tycho Brahe who was attempting to reduce the errors in his estimates of the locations of various celestial bodies.

The idea of the median originated in Edward Wright's book on navigation (Certaine Errors in Navigation) in 1599 in a section concerning the determination of location with a compass. Wright felt that this value was the most likely to be the correct value in a series of observations.

In 1662, John Graunt, along with William Petty, developed early human statistical and census methods that provided a framework for modern demography. He produced the first life table, giving probabilities of survival to each age. His book Natural and Political Observations Made upon the Bills of Mortality used analysis of the mortality rolls to make the first statistically based estimation of the population of London. He knew that there were around 13,000 funerals per year in London and that three people died per eleven families per year. He estimated from the parish records that the average family size was 8 and calculated that the population of London was about 384,000; this is the first known use of a ratio estimator. Laplace in 1802 estimated the population of France with a similar method.
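A plausible reconstruction of the arithmetic behind Graunt's ratio estimate (the intermediate rounding shown here is an assumption, not part of the original text) is:

\text{families} \approx 13{,}000 \times \frac{11}{3} \approx 48{,}000, \qquad \text{population} \approx 48{,}000 \times 8 \approx 384{,}000.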

Although the original scope of statistics was limited to data useful for governance, the approach was extended to many fields of a scientific or commercial nature during the 19th century. The mathematical foundations for the subject drew heavily on the new probability theory, pioneered in the 16th century by Gerolamo Cardano and developed further in the 17th century by Pierre de Fermat and Blaise Pascal. Christiaan Huygens (1657) gave the earliest known scientific treatment of the subject. Jakob Bernoulli's Ars Conjectandi (posthumous, 1713) and Abraham de Moivre's The Doctrine of Chances (1718) treated the subject as a branch of mathematics. In his book Bernoulli introduced the idea of representing complete certainty as one and probability as a number between zero and one.

A key early application of statistics in the 18th century was to the human sex ratio at birth.[8] John Arbuthnot studied this question in 1710.[9][10][11][12] Arbuthnot examined birth records in London for each of the 82 years from 1629 to 1710. In every year, the number of males born in London exceeded the number of females. Considering more male or more female births as equally likely, the probability of the observed outcome is 0.5^82, or about 1 in 4,836,000,000,000,000,000,000,000; in modern terms, the p-value. This is vanishingly small, leading Arbuthnot to conclude that this was not due to chance, but to divine providence: "From whence it follows, that it is Art, not Chance, that governs." This and other work by Arbuthnot is credited as "the first use of significance tests",[13] the first example of reasoning about statistical significance and moral certainty,[14] and "… perhaps the first published report of a nonparametric test …",[10] specifically the sign test.
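In modern notation, the calculation behind Arbuthnot's argument is simply the probability of 82 "male" years in a row under the fair-coin hypothesis:

p = \left(\tfrac{1}{2}\right)^{82} \approx 2.07 \times 10^{-25},

i.e. roughly 1 in 4.8 \times 10^{24}.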

The formal study of theory of errors may be traced back to Roger Cotes' Opera Miscellanea (posthumous, 1722), but a memoir prepared by Thomas Simpson in 1755 (printed 1756) first applied the theory to the discussion of errors of observation. The reprint (1757) of this memoir lays down the axioms that positive and negative errors are equally probable, and that there are certain assignable limits within which all errors may be supposed to fall; continuous errors are discussed and a probability curve is given. Simpson discussed several possible distributions of error. He first considered the uniform distribution and then the discrete symmetric triangular distribution, followed by the continuous symmetric triangular distribution. Tobias Mayer, in his study of the libration of the moon (Kosmographische Nachrichten, Nuremberg, 1750), invented the first formal method for estimating unknown quantities by generalizing the averaging of observations under identical circumstances to the averaging of groups of similar equations.

Roger Joseph Boscovich, in his 1755 work on the shape of the earth, De Litteraria expeditione per pontificiam ditionem ad dimetiendos duos meridiani gradus a PP. Maire et Boscovich, proposed that the true value of a series of observations would be that which minimises the sum of absolute errors. In modern terminology this value is the median. The first example of what later became known as the normal curve was studied by Abraham de Moivre, who plotted this curve on November 12, 1733.[15] De Moivre was studying the number of heads that occurred when a 'fair' coin was tossed.

In 1761 Thomas Bayes proved Bayes' theorem and in 1765 Joseph Priestley invented the first timeline charts.

Johann Heinrich Lambert in his 1765 book Anlage zur Architectonic proposed the semicircle as a distribution of errors, i.e. (in modern notation) the semicircular density

f(x) = \frac{2}{\pi}\sqrt{1 - x^2},

with -1 < x < 1.

[Figure: probability density plots for the Laplace distribution.]

Pierre-Simon Laplace (1774) made the first attempt to deduce a rule for the combination of observations from the principles of the theory of probabilities. He represented the law of probability of errors by a curve and deduced a formula for the mean of three observations.

Laplace in 1774 noted that the frequency of an error could be expressed as an exponential function of its magnitude once its sign was disregarded.[16][17] This distribution is now known as the Laplace distribution. Lagrange proposed a parabolic distribution of errors in 1776.
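In modern notation, Laplace's first law of errors corresponds to the double-exponential (Laplace) density

f(x) = \frac{1}{2b}\exp\!\left(-\frac{|x-\mu|}{b}\right), \qquad b > 0,

with location \mu and scale b.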

Laplace in 1778 published his second law of errors, wherein he noted that the frequency of an error was proportional to the exponential of the negative square of its magnitude. This was subsequently rediscovered by Gauss (possibly in 1795) and is now best known as the normal distribution, which is of central importance in statistics.[18] This distribution was first referred to as the normal distribution by C. S. Peirce in 1873, who was studying measurement errors when an object was dropped onto a wooden base.[19] He chose the term normal because of its frequent occurrence in naturally occurring variables.
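Laplace's second law is, in modern notation, the normal (Gaussian) density

f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),

with mean \mu and standard deviation \sigma.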

Lagrange also suggested in 1781 two other distributions for errors - a raised cosine distribution and a logarithmic distribution.

Laplace gave (1781) a formula for the law of facility of error (a term due to Joseph Louis Lagrange, 1774), but one which led to unmanageable equations. Daniel Bernoulli (1778) introduced the principle of the maximum product of the probabilities of a system of concurrent errors.

In 1786 William Playfair (1759-1823) introduced the idea of graphical representation into statistics. He invented the line chart, bar chart and histogram and incorporated them into his works on economics, the Commercial and Political Atlas. This was followed in 1795 by his invention of the pie chart and circle chart which he used to display the evolution of England's imports and exports. These latter charts came to general attention when he published examples in his Statistical Breviary in 1801.

Laplace, in an investigation of the motions of Saturn and Jupiter in 1787, generalized Mayer's method by using different linear combinations of a single group of equations.

In 1791 Sir John Sinclair introduced the term 'statistics' into English in his Statistical Account of Scotland.

In 1802 Laplace estimated the population of France to be 28,328,612.[20] He calculated this figure using the number of births in the previous year and census data for three communities. The census data of these communities showed that they had 2,037,615 persons and that the number of births was 71,866. Assuming that these samples were representative of France, Laplace produced his estimate for the entire population.
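The structure of Laplace's calculation is that of a ratio estimator: the annual number of births B for the whole of France (known from registration records, and not quoted in the text above) is scaled by the ratio of population to births observed in the sample communities,

\hat{N} = B \times \frac{P_s}{B_s} = B \times \frac{2{,}037{,}615}{71{,}866} \approx 28.35\,B,

which, combined with the national birth total Laplace used, yields his figure of 28,328,612.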

The method of least squares, which was used to minimize errors in data measurement, was published independently by Adrien-Marie Legendre (1805), Robert Adrain (1808), and Carl Friedrich Gauss (1809). Gauss had used the method in his famous 1801 prediction of the location of the dwarf planet Ceres. The observations that Gauss based his calculations on were made by the Italian monk Piazzi.
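In modern matrix notation (not the historical presentation of Legendre or Gauss), the least-squares principle chooses the coefficients that minimise the sum of squared residuals; for a linear model y = X\beta + \varepsilon, and provided X^{\mathsf T}X is invertible, the solution is

\hat{\beta} = \arg\min_{\beta} \lVert y - X\beta \rVert^2 = (X^{\mathsf T} X)^{-1} X^{\mathsf T} y.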

The method of least squares was preceded by the use of a median regression slope, a method that minimizes the sum of the absolute deviations. A method of estimating this slope was invented by Roger Joseph Boscovich in 1760, which he applied to astronomy.

The term probable error (der wahrscheinliche Fehler) - the median absolute deviation from the mean - was introduced in 1815 by the German astronomer Friedrich Wilhelm Bessel. Antoine Augustin Cournot in 1843 was the first to use the term median (valeur médiane) for the value that divides a probability distribution into two equal halves.

Other contributors to the theory of errors were Ellis (1844), De Morgan (1864), Glaisher (1872), and Giovanni Schiaparelli (1875).[citation needed] Peters's (1856) formula for the "probable error" of a single observation was widely used and inspired early robust statistics (resistant to outliers: see Peirce's criterion).

In the 19th century authors on statistical theory included Laplace, S. Lacroix (1816), Littrow (1833), Dedekind (1860), Helmert (1872), Laurent (1873), Liagre, Didion, De Morgan and Boole.

Gustav Theodor Fechner used the median (Centralwerth) in sociological and psychological phenomena.[21] It had earlier been used only in astronomy and related fields. Francis Galton used the English term median for the first time in 1881 having earlier used the terms middle-most value in 1869 and the medium in 1880.[22]

Adolphe Quetelet (1796–1874), another important founder of statistics, introduced the notion of the "average man" (l'homme moyen) as a means of understanding complex social phenomena such as crime rates, marriage rates, and suicide rates.[23]

The first tests of the normal distribution were invented by the German statistician Wilhelm Lexis in the 1870s. The only data sets available to him that he could show to be normally distributed were birth rates.

Development of modern statistics

Although the origins of statistical theory lie in the 18th-century advances in probability, the modern field of statistics only emerged in the late-19th and early-20th century in three stages. The first wave, at the turn of the century, was led by the work of Francis Galton and Karl Pearson, who transformed statistics into a rigorous mathematical discipline used for analysis, not just in science, but in industry and politics as well. The second wave of the 1910s and 20s was initiated by William Sealy Gosset, and reached its culmination in the insights of Ronald Fisher. This involved the development of better models for the design of experiments, hypothesis testing, and techniques for use with small data samples. The final wave, which mainly saw the refinement and expansion of earlier developments, emerged from the collaborative work between Egon Pearson and Jerzy Neyman in the 1930s.[24] Today, statistical methods are applied in all fields that involve decision making, both for drawing accurate inferences from a collated body of data and for making decisions in the face of uncertainty.

The first statistical bodies were established in the early 19th century. The Royal Statistical Society was founded in 1834, and Florence Nightingale, its first female member, pioneered the application of statistical analysis to health problems for the furtherance of epidemiological understanding and public health practice. However, the methods then used would not be considered modern statistics today.

The Oxford scholar Francis Ysidro Edgeworth's book, Metretike: or The Method of Measuring Probability and Utility (1887) dealt with probability as the basis of inductive reasoning, and his later works focused on the 'philosophy of chance'.[25] His first paper on statistics (1883) explored the law of error (normal distribution), and his Methods of Statistics (1885) introduced an early version of the t distribution, the Edgeworth expansion, the Edgeworth series, the method of variate transformation and the asymptotic theory of maximum likelihood estimates.

The Norwegian Anders Nicolai Kiær introduced the concept of stratified sampling in 1895.[26] Arthur Lyon Bowley introduced new methods of data sampling in 1906 when working on social statistics. Although statistical surveys of social conditions had started with Charles Booth's "Life and Labour of the People in London" (1889-1903) and Seebohm Rowntree's "Poverty, A Study of Town Life" (1901), Bowley's key innovation consisted of the use of random sampling techniques. His efforts culminated in his New Survey of London Life and Labour.[27]

Francis Galton is credited as one of the principal founders of statistical theory. His contributions to the field included introducing the concepts of standard deviation, correlation, regression and the application of these methods to the study of the variety of human characteristics - height, weight, eyelash length among others. He found that many of these could be fitted to a normal curve distribution.[28]

Galton submitted a paper to Nature in 1907 on the usefulness of the median.[29] He examined the accuracy of 787 guesses of the weight of an ox at a country fair. The actual weight was 1208 pounds: the median guess was 1198. The guesses were markedly non-normally distributed.

Galton's publication of Natural Inheritance in 1889 sparked the interest of a brilliant mathematician, Karl Pearson,[30] then working at University College London, and he went on to found the discipline of mathematical statistics.[31] He emphasised the statistical foundation of scientific laws and promoted its study, and his laboratory attracted students from around the world, including Udny Yule, drawn by his new methods of analysis. His work grew to encompass the fields of biology, epidemiology, anthropometry, medicine and social history. In 1901, with Walter Weldon, founder of biometry, and Galton, he founded the journal Biometrika as the first journal of mathematical statistics and biometry.

His work, and that of Galton, underpins many of the 'classical' statistical methods which are in common use today, including the correlation coefficient, defined as a product-moment;[32] the method of moments for the fitting of distributions to samples; Pearson's system of continuous curves that forms the basis of the now conventional continuous probability distributions; chi distance, a precursor and special case of the Mahalanobis distance;[33] and the p-value, defined as the probability measure of the complement of the ball with the hypothesized value as center point and chi distance as radius.[33] He also introduced the term 'standard deviation'.
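The product-moment definition of the correlation coefficient referred to above is, in modern notation, for paired observations (x_i, y_i),

r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}.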

He also founded the theory of statistical hypothesis testing,[33] Pearson's chi-squared test and principal component analysis.[34][35] In 1911 he founded the world's first university statistics department at University College London.
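Pearson's chi-squared statistic compares observed counts O_i with the counts E_i expected under the hypothesised distribution,

\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},

and is referred to a chi-squared distribution with the appropriate number of degrees of freedom.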

The second wave of mathematical statistics was pioneered by Ronald Fisher, who wrote two textbooks, Statistical Methods for Research Workers, published in 1925, and The Design of Experiments, published in 1935, that were to define the academic discipline in universities around the world. He also systematized previous results, putting them on a firm mathematical footing. His seminal 1918 paper The Correlation between Relatives on the Supposition of Mendelian Inheritance made the first use of the statistical term variance. In 1919, at Rothamsted Experimental Station, he started a major study of the extensive collections of data recorded over many years. This resulted in a series of reports under the general title Studies in Crop Variation. In 1930 he published The Genetical Theory of Natural Selection, where he applied statistics to evolution.

Over the next seven years, he pioneered the principles of the design of experiments (see below) and elaborated his studies of analysis of variance. He furthered his studies of the statistics of small samples. Perhaps even more importantly, he began his systematic approach to the analysis of real data as the springboard for the development of new statistical methods. He developed computational algorithms for analyzing data from his balanced experimental designs. In 1925, this work resulted in the publication of his first book, Statistical Methods for Research Workers.[36] This book went through many editions and translations in later years, and it became the standard reference work for scientists in many disciplines. In 1935, this book was followed by The Design of Experiments, which was also widely used.

In addition to analysis of variance, Fisher named and promoted the method of maximum likelihood estimation. Fisher also originated the concepts of sufficiency, ancillary statistics, Fisher's linear discriminator and Fisher information. His article On a distribution yielding the error functions of several well known statistics (1924) presented Pearson's chi-squared test and William Sealy Gosset's t in the same framework as the Gaussian distribution, as well as his own parameter in the analysis of variance, Fisher's z-distribution (more commonly used decades later in the form of the F distribution).[37] The 5% level of significance appears to have been introduced by Fisher in 1925.[38] Fisher stated that deviations exceeding twice the standard deviation are regarded as significant. Before this, deviations exceeding three times the probable error were considered significant. For a symmetrical distribution the probable error is half the interquartile range. For a normal distribution the probable error is approximately 2/3 the standard deviation. It appears that Fisher's 5% criterion was rooted in previous practice.
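The arithmetic behind the last remark: for a normal distribution the probable error is about 0.6745\sigma, so the older "three probable errors" rule corresponds to roughly two standard deviations, essentially Fisher's criterion, and twice the standard deviation cuts off about 5% of the distribution in the two tails:

3 \times 0.6745\,\sigma \approx 2.02\,\sigma, \qquad P(|X - \mu| > 2\sigma) \approx 0.0455.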

Other important contributions at this time included Charles Spearman's rank correlation coefficient that was a useful extension of the Pearson correlation coefficient. William Sealy Gosset, the English statistician better known under his pseudonym of Student, introduced Student's t-distribution, a continuous probability distribution useful in situations where the sample size is small and population standard deviation is unknown.
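Student's t-distribution arises, in modern notation, as the sampling distribution of the statistic

t = \frac{\bar{x} - \mu}{s/\sqrt{n}},

where \bar{x} is the sample mean, s the sample standard deviation and n the sample size, when the underlying observations are normally distributed; it has n - 1 degrees of freedom.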

Egon Pearson (Karl's son) and Jerzy Neyman introduced the concepts of "Type II" error, power of a test and confidence intervals. Jerzy Neyman in 1934 showed that stratified random sampling was in general a better method of estimation than purposive (quota) sampling.[39]
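In modern textbook notation (not the original Neyman-Pearson presentation), the power of a test is 1 - \beta, where \beta is the probability of a Type II error, and for stratified random sampling with strata weights W_h = N_h / N the population mean is estimated by

\bar{x}_{\mathrm{st}} = \sum_{h} W_h \bar{x}_h, \qquad \widehat{\operatorname{Var}}(\bar{x}_{\mathrm{st}}) = \sum_{h} W_h^2 \frac{s_h^2}{n_h},

ignoring finite-population corrections.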


References

  1. Thucydides (1985). History of the Peloponnesian War. New York: Penguin Books, Ltd. pp. 204.
  2. Broemeling, Lyle D. (1 November 2011). "An Account of Early Statistical Inference in Arab Cryptology". The American Statistician 65 (4): 255–257. doi:10.1198/tas.2011.10191.
  3. Singh, Simon (2000). The code book: the science of secrecy from ancient Egypt to quantum cryptography (1st Anchor Books ed.). New York: Anchor Books. ISBN 978-0-385-49532-5.
  4. Ibrahim A. Al-Kadi "The origins of cryptology: The Arab contributions", Cryptologia, 16(2) (April 1992) pp. 97–126.
  5. "Al-Kindi, Cryptography, Codebreaking and Ciphers". Retrieved 2007-01-12.
  6. Glick, Thomas F.; Livesey, Steven John; Wallis, Faith (2005), Medieval Science, Technology, and Medicine: An Encyclopedia, Routledge, pp. 89–90, ISBN 0-415-96930-1 
  7. Villani, Giovanni. Encyclopædia Britannica. Encyclopædia Britannica 2006 Ultimate Reference Suite DVD. Retrieved on 2008-03-04.
  8. Brian, Éric; Jaisson, Marie (2007). "Physico-Theology and Mathematics (1710–1794)". The Descent of Human Sex Ratio at Birth. Springer Science & Business Media. pp. 1–25. ISBN 978-1-4020-6036-6. 
  9. John Arbuthnot (1710). "An argument for Divine Providence, taken from the constant regularity observed in the births of both sexes". Philosophical Transactions of the Royal Society of London 27 (325–336): 186–190. doi:10.1098/rstl.1710.0011. http://www.york.ac.uk/depts/maths/histstat/arbuthnot.pdf. 
  10. Conover, W.J. (1999), "Chapter 3.4: The Sign Test", Practical Nonparametric Statistics (Third ed.), Wiley, pp. 157–176, ISBN 978-0-471-16068-7 
  11. Sprent, P. (1989), Applied Nonparametric Statistical Methods (Second ed.), Chapman & Hall, ISBN 978-0-412-44980-2 
  12. Stigler, Stephen M. (1986). The History of Statistics: The Measurement of Uncertainty Before 1900. Harvard University Press. pp. 225–226. ISBN 978-0-67440341-3. 
  13. Bellhouse, P. (2001), "John Arbuthnot", in Statisticians of the Centuries by C.C. Heyde and E. Seneta, Springer, pp. 39–42, ISBN 978-0-387-95329-8 
  14. Hald, Anders (1998), "Chapter 4. Chance or Design: Tests of Significance", A History of Mathematical Statistics from 1750 to 1930, Wiley, pp. 65 
  15. de Moivre, A. (1738) The doctrine of chances. Woodfall
  16. Laplace, P-S (1774). "Mémoire sur la probabilité des causes par les évènements". Mémoires de l'Académie Royale des Sciences Présentés par Divers Savants 6: 621–656. 
  17. Wilson, Edwin Bidwell (1923) "First and second laws of error", Journal of the American Statistical Association, 18 (143), 841-851 JSTOR 2965467
  18. Havil J (2003) Gamma: Exploring Euler's Constant. Princeton, NJ: Princeton University Press, p. 157
  19. C. S. Peirce (1873) Theory of errors of observations. Report of the Superintendent US Coast Survey, Washington, Government Printing Office. Appendix no. 21: 200-224
  20. Cochran W.G. (1978) "Laplace's ratio estimators". pp 3-10. In David H.A., (ed). Contributions to Survey Sampling and Applied Statistics: papers in honor of H. O. Hartley. Academic Press, New York ISBN 978-1483237930
  21. Keynes, JM (1921) A treatise on probability. Pt II Ch XVII §5 (p 201)
  22. Galton F (1881) Report of the Anthropometric Committee pp 245-260. Report of the 51st Meeting of the British Association for the Advancement of Science
  23. Stigler (1986, Chapter 5: Quetelet's Two Attempts)
  24. Helen Mary Walker (1975). Studies in the history of statistical method. Arno Press. ISBN 9780405066283. https://books.google.com/books?id=jYFRAAAAMAAJ. 
  25. (Stigler 1986, Chapter 9: The Next Generation: Edgeworth)
  26. Bellhouse DR (1988) A brief history of random sampling methods. Handbook of statistics. Vol 6 pp 1-14 Elsevier
  27. Bowley, AL (1906). "Address to the Economic Science and Statistics Section of the British Association for the Advancement of Science". J R Stat Soc 69: 548–557. doi:10.2307/2339344. JSTOR 2339344. https://zenodo.org/record/2152740. 
  28. Galton, F (1877). "Typical laws of heredity". Nature 15 (388): 492–553. doi:10.1038/015492a0. 
  29. Galton, F (1907). "One Vote, One Value". Nature 75 (1948): 414. doi:10.1038/075414a0. https://zenodo.org/record/2125755. 
  30. Stigler (1986, Chapter 10: Pearson and Yule)
  31. Varberg, Dale E. (1963). "The development of modern statistics". The Mathematics Teacher 56 (4): 252–257. JSTOR 27956805. 
  32. Stigler, S. M. (1989). "Francis Galton's Account of the Invention of Correlation". Statistical Science 4 (2): 73–79. doi:10.1214/ss/1177012580. 
  33. Pearson, K. (1900). "On the Criterion that a given System of Deviations from the Probable in the Case of a Correlated System of Variables is such that it can be reasonably supposed to have arisen from Random Sampling". Philosophical Magazine. Series 5 50 (302): 157–175. doi:10.1080/14786440009463897. https://zenodo.org/record/1430618. 
  34. Pearson, K. (1901). "On Lines and Planes of Closest Fit to Systems of Points in Space". Philosophical Magazine. Series 6 2 (11): 559–572. doi:10.1080/14786440109462720. https://zenodo.org/record/1430636. 
  35. Jolliffe, I. T. (2002). Principal Component Analysis, 2nd ed. New York: Springer-Verlag.
  36. Box, R. A. Fisher, pp 93–166
  37. Agresti, Alan; David B. Hitchcock (2005). "Bayesian Inference for Categorical Data Analysis". Statistical Methods & Applications 14 (3): 298. doi:10.1007/s10260-005-0121-y. http://www.stat.ufl.edu/~aa/articles/agresti_hitchcock_2005.pdf. 
  38. Fisher RA (1925) Statistical methods for research workers, Edinburgh: Oliver & Boyd
  39. Neyman, J (1934) On the two different aspects of the representative method: The method of stratified sampling and the method of purposive selection. Journal of the Royal Statistical Society 97 (4) 557-625 JSTOR 2342192

Bibliography

  • Freedman, D. (1999). "From association to causation: Some remarks on the history of statistics". Statistical Science 14 (3): 243–258. doi:10.1214/ss/1009212409.  (Revised version, 2002)
  • Hald, Anders (2003). A History of Probability and Statistics and Their Applications before 1750. Hoboken, NJ: Wiley. ISBN 978-0-471-47129-5. 
  • Hald, Anders (1998). A History of Mathematical Statistics from 1750 to 1930. New York: Wiley. ISBN 978-0-471-17912-2. 
  • Kotz, S., Johnson, N.L. (1992,1992,1997). Breakthroughs in Statistics, Vols I, II, III. Springer ISBN 0-387-94037-5, ISBN 0-387-94039-1, ISBN 0-387-94989-5
  • Pearson, Egon (1978). The History of Statistics in the 17th and 18th Centuries against the changing background of intellectual, scientific and religious thought (Lectures by Karl Pearson given at University College London during the academic sessions 1921-1933). New York: Macmillan Publishing Co., Inc. pp. 744. ISBN 978-0-02-850120-8. 
  • Salsburg, David (2001). The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century. ISBN 0-7167-4106-7
  • Stigler, Stephen M. (1986). The History of Statistics: The Measurement of Uncertainty before 1900. Belknap Press/Harvard University Press. ISBN 978-0-674-40341-3. 
  • Stigler, Stephen M. (1999) Statistics on the Table: The History of Statistical Concepts and Methods. Harvard University Press. ISBN 0-674-83601-4
  • David, H. A. (1995). "First (?) Occurrence of Common Terms in Mathematical Statistics". The American Statistician 49 (2): 121–133. doi:10.2307/2684625. JSTOR 2684625. 
