The described need to understand random phenomena and to make adequate decisions when confronted with uncertainty has been recognised by many educational authorities. Consequently, the teaching of probability is included in curricula in many countries during primary or secondary education. An important area of research in probability education is the analysis of curricular guidelines and curricular materials, such as textbooks. Both topics are now commented on in turn.

The NCTM recommendations for Grades 3-5 have been reproduced in other curricular guidelines for primary school. Today, some curricula include probability from the first or second levels of primary education.

In the case of Mexico, for example, probability was postponed to the middle school level on the argument that primary school teachers have many difficulties in understanding probability and therefore are not well prepared to teach the topic. This change does not take into account, however, the relevance of educating probabilistic reasoning in young children, which was emphasised by Fischbein, or the multiple connections between probability and other areas of mathematics, as stated in the Guidelines for Assessment and Instruction in Statistics Education (GAISE) for pre-K-12 levels (Franklin et al.).

Probability is a part of mathematics that enriches the subject as a whole by its interactions with other uses of mathematics.

In Mexico, there are different high school strands; in most of them a compulsory course in probability and statistics is included. In France, the main statistical content in the last year of high school (terminale) is statistical inference.

Research shows the coexistence of different interpretations as well as misconceptions held by students and suggests the need to reinforce understanding of randomness in students (Batanero).

Events and sample space. Some children only concentrate on a single event since their thinking is mainly deterministic (Langrall and Mooney). It is then important that children understand the need to take into account all different possible outcomes in an experiment to compute its probability.

Combinatorial enumeration and counting. Combinatorics is used in listing all the events in a sample space or in counting without listing all its elements. Although in the frequentist approach we do not need combinatorics to estimate the value of probability, combinatorial reasoning is nevertheless needed in other situations, for example, to understand how events in a compound experiment are formed or to understand how different samples of the same size can be selected from a population.
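As a brief illustration of combinatorial enumeration, the following Python sketch lists the sample space of a compound experiment and counts samples drawn without replacement. The two-dice experiment, the five-letter population, and the function names are our own assumptions chosen for illustration; they do not come from the studies cited above.

```python
from fractions import Fraction
from itertools import combinations, product


def sample_space_two_dice():
    """Enumerate every ordered outcome (first die, second die)."""
    return list(product(range(1, 7), repeat=2))


def classical_probability(event, sample_space):
    """Favourable outcomes over all outcomes, assuming equiprobability."""
    favourable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favourable, len(sample_space))


def samples_of_size(population, size):
    """All unordered samples of a given size, drawn without replacement."""
    return list(combinations(population, size))


if __name__ == "__main__":
    space = sample_space_two_dice()
    print(len(space))  # number of outcomes in the compound experiment
    print(classical_probability(lambda o: o[0] + o[1] == 7, space))
    print(len(samples_of_size("ABCDE", 2)))  # samples of size 2 from 5 items
```

Listing the outcomes explicitly mirrors what a tree diagram does on paper: each branch of the first die is extended by every branch of the second.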

Combinatorial reasoning is difficult; however, it is possible to use tools such as tree diagrams to help students reinforce this particular type of reasoning.

Independence and conditional probability. The notion of independence is important to understand simulations and the empirical estimates of probability via frequency, since when repeating experiments we require independence of trials. Computing probabilities in compound experiments requires one to analyse whether the experiments are dependent or not. Finally, the idea of conditional probability is needed to understand many concepts in probability and statistics, such as confidence intervals or hypothesis tests.

Probability distribution and expectation. Although there is abundant research related to distribution, most of this research concentrates on data distributions or on sampling distributions.

Another type of distribution is linked to the random variable, a powerful idea in probability, as well as the associated idea of expectation. Some probability distribution models in wide use are the binomial, uniform, and normal distributions.

Convergence and laws of large numbers. The progressive stabilisation of the relative frequency of a given outcome in a large number of trials has been observed for centuries; Bernoulli proved the first version of the law of large numbers that justified the frequentist definition of probability.

Today the frequentist approach, where probability is estimated from the relative frequency of a result in a long series of trials, is promoted in teaching. It is important that students understand that each outcome is unpredictable and that regularity is only achieved in the long run. At the same time, older students should be able to discriminate between a frequency estimate (a value that varies) and probability (which is always a theoretical value) (Chaput et al.).
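The stabilisation of relative frequencies can be explored with a short simulation. The sketch below is only an illustration under stated assumptions: a hypothetical experiment with success probability 0.5, a fixed seed, and a function name of our own. It records the relative frequency after each trial, so that early fluctuation and later settling can both be inspected.

```python
import random


def running_relative_frequency(n_trials, p_success=0.5, seed=0):
    """Relative frequency of success after each of n_trials independent trials.

    p_success is the theoretical (model) probability; the returned values
    are the varying frequency estimates.
    """
    rng = random.Random(seed)  # fixed seed so the run can be repeated
    successes = 0
    frequencies = []
    for trial in range(1, n_trials + 1):
        if rng.random() < p_success:
            successes += 1
        frequencies.append(successes / trial)
    return frequencies


if __name__ == "__main__":
    freqs = running_relative_frequency(10_000)
    for n in (10, 100, 1_000, 10_000):
        print(n, round(freqs[n - 1], 3))
```

Runs with different seeds give different sequences: the frequency estimate varies from run to run, whereas the model value `p_success` does not.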

Sampling and sampling distribution. Given that we are rarely able to study complete populations, our knowledge of a population is based on samples. Students are required to understand the ideas of sample representativeness and sampling variability. The sampling distribution describes the variation of a summary measure across different samples from the same population. Instead of using the exact sampling distribution, teaching can rely on simulation to obtain an empirical sampling distribution. This is a suitable teaching strategy, but teachers should be conscious that, as with any estimate, the empirical sampling distribution only approximates the theoretical sampling distribution.
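To make the difference between an empirical and a theoretical sampling distribution concrete, the sketch below simulates repeated samples and compares the spread of their means with the theoretical standard error. The population (rolls of a fair six-sided die), the choice of the sample mean as the summary measure, and all function names are assumptions made for this illustration.

```python
import math
import random
import statistics


def empirical_sampling_distribution(sample_size, n_samples, seed=0):
    """Means of n_samples simulated samples of fair-die rolls."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        sample = [rng.randint(1, 6) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means


def theoretical_standard_error(sample_size):
    """Standard deviation of the mean of sample_size fair-die rolls."""
    variance_single_roll = 35 / 12  # variance of one fair six-sided die
    return math.sqrt(variance_single_roll / sample_size)


if __name__ == "__main__":
    for size in (5, 30):
        means = empirical_sampling_distribution(size, n_samples=2_000)
        print(size,
              round(statistics.stdev(means), 3),          # empirical spread
              round(theoretical_standard_error(size), 3))  # theoretical value
```

The empirical and theoretical values are close but not equal, and the gap changes with the number of simulated samples, which is the caution raised above for teachers.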

Modelling and simulation. Today we witness increasing recommendations to approach the teaching of probability from the point of view of modelling (Chaput et al.). Simulation allows the exploration of probability concepts and properties, and is used in informal approaches to inference. Simulation acts as an intermediary step between reality and the mathematical model.

When teaching probability it is important to take into account the informal ideas that children and adolescents assign to chance and probability before instruction. These ideas are described in the breadth and depth of research investigating probabilistic intuitions, informal notions of probability, and resulting learning difficulties.

We now revisit the essentials associated with probabilistic intuition and difficulties associated with learning probability. Initial research in probability cognition was undertaken during the 1950s and 1960s by Piaget and Inhelder and by psychologists with varying theoretical orientations (Jones and Thornton). Alternatively stated, research investigating intuition and learning difficulties was central at the beginnings of research in probabilistic thinking and would continue on into the next historical phase of research.

The work of Fischbein would continue the work of Piaget and Inhelder. As mentioned, other investigations involving intuition were occurring in the field of psychology during this period, using different terminology. The research of Tversky and Kahneman revealed numerous heuristics. This research program played a key role in shaping many other fields of research (see, for example, behavioural economics). In the field of mathematics education, the research of Shaughnessy brought forth not only the theoretical ideas of Tversky and Kahneman, but also, in essence, research on probabilistic intuitions and learning difficulties.

Although not explicitly deemed as intuitions and difficulties, work in this general area of research was conducted by a number of different individuals. As the Post-Piagetian Period came to a close, the field of mathematics education began to see an increasing volume of research on intuitions and learning difficulties. Moving from one period to the next, research into probabilistic intuitions and learning difficulties would come into its own during what Jones called Phase Three: Contemporary Research. During this new phase there was, arguably, a major shift towards investigating curriculum and instruction, and the leadership of investigating probabilistic intuitions and learning difficulties was carried on by a particular group of researchers.

Worthy of note, mathematics education researchers in this phase, as was the case with Konold and Falk in the previous phase, began to develop their own theories, frameworks, and models associated with responses to a variety of probabilistic tasks. These theories, frameworks, and models were developed during research that investigated a variety of topics in probability, which included difficulties associated with randomness.

Worthy of note, the term misconception, which acted as the de facto terminology for a number of years, has more recently evolved to preconceptions and other variants, which are perhaps better aligned with other theories in the field of mathematics education.

In line with the above, research developing theories, models, and frameworks associated with intuition and learning difficulties continued into the next phase of research, which Chernoff and Sriraman have prematurely called the Assimilation Period. Gone are the early days where researchers were attempting to replicate research found in different fields, such as psychology. With that said, researchers are attempting to import theories, models, and frameworks from other fields; however, researchers in the field of mathematics education are forging their own interpretations of results stemming from the intuitive nature and difficulties associated with probability thinking and the teaching and learning of probability.

Theories, models, and frameworks such as inadvertent metonymy (Abrahamson), sample space partitions (Chernoff), and others demonstrate that research into intuitions and difficulties continues in the field of mathematics education. This does not mean, however, that the field does not continue to look to other domains of research to help better inform mathematics education. For example, recent investigations embracing research from other fields have opened the door to alternative views of heuristics, intuitions, and learning difficulties, such as in the work by Gigerenzer and the Adaptive Behavior and Cognition (ABC) Group at the Max Planck Institute for Human Development in Berlin.

Based on these developments, the field of mathematics education is also starting to develop research that builds upon and questions certain aspects of probabilistic intuitions and learning difficulties. For example, see the recent string of studies by Chernoff.

In considering how students reason about probability, advances in technology and other educational resources have allowed for another important area of research, as described in the next section.

Many educational resources have been used to support probability education.

Some of the most common resources include physical devices such as dice, coins, spinners, marbles in a bag, and a Galton board that help create game-like scenarios that involve chance (Nilsson). These devices are often used to support a classical approach to probability for computing the probability of an event occurring a priori by examining the object and making assumptions about symmetry that often lead to equiprobable outcomes for a single trial.

Organisational tools such as two-by-two tables and tree diagrams are also used to assist in enumerating sample spaces (Nunes et al.). Since physical devices can also be acted upon, curriculum resources and teachers have increased the use of experiments with these devices to induce chance events. The resulting frequencies and relative frequencies are used as an estimate of probability in the frequentist perspective and are then often compared to the a priori probability computed by examining the object.

As Pratt put it: "Of course, if the modelling meaning of probability was stressed in the curriculum, it is debatable whether there is much advantage in maintaining the current emphasis on coins, spinners, dice and balls drawn from a bag. Perhaps, in days gone by when children played board games, there was some natural relevance in such contexts but, now that games take place in real time on screens, probability has much more relevance as a tool for modelling computer-based action and for simulating real-world events and phenomena."

One way to help students use probability to model real-world phenomena is to require them to make a model explicit when using technology. Sampling, storing, organising, and analysing data generated from a probabilistic model are facilitated tremendously by technology.

These recommendations have been used by many researchers and have recently been made explicit as recommendations for teachers by Lee and Lee and for researchers by Pratt and Ainley and Pratt et al.

One major contribution of technology to the study of probability is the ability to generate a large sample of data very quickly, store data in raw form as a sequence of outputs or organised table of frequencies, and collapse data into various aggregate representations. Several researchers have discussed what students notice about sample size, particularly when they are able to examine its impact on variability in data distributions.

The ability of technology tools such as Probability Explorer or Fathom to store long lists of data sequences can also afford opportunities for students to examine a history of outcomes in the order in which they occurred, as well as conveniently collapsed in frequency and relative frequency tables and graphs. Technology tools bring real power to classrooms by allowing students to rapidly generate a large amount of data, quickly create tabular and graphical representations, and perform computations on data with ease.

Students can then spend more time focused on making sense of data in various representations. Different representations of data in aggregate form can afford different perspectives on a problem. In addition, technology facilitates quickly generating, storing, and comparing multiple samples, each consisting of as many trials as desired. Instead of having each student in a class collect a small amount of data and then pool the data to form a class aggregate, students or small groups can generate individual samples of data, both small and large, and engage in reasoning about their own data as well as contribute to class discussions that examine results across samples (Stohl and Tarr). Technology can also provide opportunities for students to make sense of how an empirical distribution changes as data is being collected.

This dynamic view of a distribution can assist students in exploring the relationship between a probability model and a resulting empirical distribution and how sample size impacts the variability observed (Drier; Pratt et al.). This relies upon intuitions about laws of large numbers; such intuitions may be strengthened by observing the settling down of relative frequency distributions as trials are simulated and displayed in real time. It is important to investigate the opportunities that technology affords for teachers and students to discuss explicitly the assumptions needed to build models to describe real-world scenarios through simulation.

The model-building process should include discussing the pertinent characteristics of a situation being modelled, while at the same time simplifying reality (Chaput et al.). Such steps are opportunities for students to grow in their understandings of a situation, the model, and many probability ideas. Using various technology tools can further afford opportunities for discussing why two different ways of modelling a situation may be similar or different and for differentiating between model and reality. Modelling may also be as complex as interpreting medical information using input probabilities from real-life scenarios (such as success or complications from back surgery) to help a patient make an important life decision.

Several tasks and tools discussed by Prodromou and Pratt and Ainley illustrate the importance of the ability to adjust parameters in a model-building and model-fitting process. This ability allows for a process of model development, data generation and analysis, and model adjustment. Konold and Kazak, for example, described how this process helped students in creating and modifying models based on how well a particular model behaved when a simulation was run.

Engaging with probability models and the data generated from such models can provide very important foundations for how probability is used in statistics, particularly in making inferences about populations and testing hypotheses. In the above cases, the models used in a simulation are typically created by students or created by a teacher but open for inspection by students. However, technology tools afford the ability to hide a model from the user such that underlying probability distributions that control a simulation are unknowable.

One example of such inference and decision-making situations can be found in the work of Lee et al.

In summary, technology provides a big opportunity for probability education but also sets some challenges. One of them is the education of teachers to teach probability in general and to use technology in their teaching of probability in particular. We deal with this specific issue in the last section of our survey.

Although school textbooks provide examples and teaching resources, some texts present too narrow a view of probabilistic concepts or only one approach to probability. Existing models of the knowledge needed by teachers in mathematics education include mathematical knowledge for teaching (MKT; Ball et al.). However, in line with Godino et al., any discussion of probability knowledge for teaching (PKT) should be grounded in the specific features of probability.

First, teachers need adequate probabilistic knowledge. However, even if prospective teachers have a degree in mathematics, they have usually only studied theoretical probability and lack experience in designing investigations or simulations to work with students (Kvatinsky and Even; Stohl). The education of primary school teachers is even more challenging, because few of them have had suitable training in either theoretical or applied probability (Franklin and Mewborn). Moreover, recent research suggests that many prospective teachers share with their students common biases in probabilistic reasoning.

A second component is the pedagogical knowledge needed to teach probability, where general principles valid for other areas of mathematics are not always appropriate (Batanero et al.). For example, in arithmetic or geometry, elementary operations can be reversed, and this reversibility can be represented by concrete materials, which serve to organise experiences where children progressively abstract the structure behind the concrete situation. The lack of reversibility in random experiences makes it more difficult for children to grasp the essential features of randomness, which may explain why they do not always develop correct probabilistic intuitions without specific instruction.

In addition to the above, probability is difficult to teach because the teacher should not only present different probabilistic concepts and their applications but also be aware of the different meanings of probability and philosophical controversies around them (Batanero et al.).

The current use of technology warrants special considerations in the education of teachers that should be analysed. Lee and Hollebrands introduced a framework to describe what they call technological pedagogical statistical knowledge (TPSK), with examples of components in this knowledge.

The evaluation and development of components in this framework for the specific case of probability is a promising research area. Another line of research is designing and evaluating suitable and effective tasks that help in increasing the probabilistic and didactic knowledge of teachers. Some researchers describe different experiences directed towards achieving this goal.

Teachers should engage with and analyse probability simulations and investigations. Simulations and experiments are recommended when working with students. To be able to use investigations in their own classrooms, teachers need competencies with this approach to teaching. When the time available for educating teachers is scarce, one possibility is to give teachers first a project or investigation to work with and, when finished, carry out a didactical analysis of the project.

Teachers should engage with case discussions.

Groth and Xu used case discussion among a group of teachers as a valuable strategy to educate teachers. The authors indicated that in teaching stochastics teachers navigate between two layers of uncertainty. On the one hand, uncertainty is part of stochastic knowledge; on the other hand, in any classroom uncertainty appears as a result of the dynamic interactions amongst teacher, students, and the topic being taught. Discussions among the teachers may help them to increase their knowledge since experiences with general pedagogy, mathematical content, and content-specific pedagogy can be offered and debated.

Teachers also need experience planning and analysing a lesson. When teachers plan and then analyse a lesson devised to teach some content they develop their probabilistic and professional knowledge (Chick and Pierce).

Teachers need to understand the probability they teach to their students. One strategy is to have teachers play the role of a learner and afterwards analyse what they learnt. If they have the chance to go through a lesson as a learner and at the same time look at it from the point of view of a teacher, they may understand better how the lesson will unfold later in the classroom.

Teachers should have extensive experience working with technology. We can also capitalise on technology as a tool-builder for teachers gaining a conceptual understanding of probabilistic ideas. Research in this area also discusses how technology can provide teachers with first-hand experience of how these tools can improve their stochastic thinking and knowledge.

Other examples describe experiences and courses specifically directed to train teachers to teach probability or suggestions of how this training should be carried out. Research and development in teacher education related to probability education is still scarce and needs to be fostered.

In the previous sections we analysed the multifaceted nature of probability, the probabilistic contents in curricula, research dealing with intuitions and misconceptions, the role of technology, and the education of teachers. To finish this survey we suggest some points where new research is needed.

Different views on probability: Since different views of probability are complementary (Henry), reducing teaching to just one approach may explain some learning difficulties, as students may consider or apply only one interpretation in situations where it is inappropriate.

Probabilistic thinking and reasoning: The teaching of probabilistic thinking is important and justified in its own right and not simply as a tool to pave the way to inferential methods of statistics. Important research problems in this regard are: (a) clarifying the way in which probabilistic thinking could contribute to improving mathematical competencies of students, (b) analysing how different probability models and their applications can be presented to the students, (c) finding ways in which it is possible to engage students in questions related to how to obtain knowledge from data and why a probability model is suitable, and (d) helping students develop valid intuitions in this field.

Probability in school curricula: Another important area of research is to enquire about how the fundamental ideas of probability have been reflected in school curricula at different levels and in different countries. The presentation of these ideas in textbooks for different curricular levels should also be taken into account, following previous research. We also need to find different levels of formalisation to teach each of these ideas depending on age and previous knowledge of students. Consequently, it is important to reflect on the main ideas that students should acquire at different ages, appropriate teaching methods, and suitable teaching situations.

Further, as the field grows and diversifies, there is reason to expect that this particular thread of research will not only continue but also grow.

We need to know more about how students construct models and how they reason with data generated from such models. It is also important to evaluate the impact of technology on recent curricula and on the education of teachers. There is also a need for more systematic research about how teachers and students use technology in classrooms and how large-scale assessment should respond to capture new meanings for probability that may emerge from students working with probability using technology tools.

In conclusion, in this brief survey we have tried to summarise the extensive research in probability education. At the same time we intended to convince our readers of the need for new research and the many different ideas that still need to be investigated. We hope to have achieved these goals and look forward to new research in probability education.

The sample space may be finite, countably infinite or uncountably infinite. The sample space is countably infinite when it can be put in one-to-one correspondence with the set N of natural numbers.

For finite or countably infinite sample spaces S, A includes all subsets of S. For uncountably infinite (continuous) sample spaces, the set algebra considered to assign probability does not include all possible subsets of S; A is restricted instead to a system of events of S that is closed under countable union and intersection operations and under forming complements (Chung).

The theoretical constructs adopted by Daniel Kahneman and Amos Tversky differed from those of Fischbein and colleagues.

Research on Teaching and Learning Probability.

Open Access. First Online: 13 July

Probability is any function defined from A to the interval of real numbers [0,1] that fulfils three axioms, from which many probability properties and theorems can be deduced.
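The three axioms are not reproduced in this copy of the text. For reference, a standard textbook statement of Kolmogorov's axioms is sketched below, using S for the sample space and A for the system of events as above. The event symbols E and E_i are our own notation, and the exact wording in the original may differ.

```latex
\begin{align*}
&\text{1. } 0 \le P(E) \le 1 \quad \text{for every event } E \in A; \\
&\text{2. } P(S) = 1; \\
&\text{3. } P\Bigl(\bigcup_{i=1}^{\infty} E_i\Bigr) = \sum_{i=1}^{\infty} P(E_i)
  \quad \text{for pairwise disjoint events } E_1, E_2, \ldots \in A.
\end{align*}
```

For a finite sample space the third axiom reduces to P(E_1 ∪ E_2) = P(E_1) + P(E_2) for any two disjoint events.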

Our exposition suggests that the different views of probability described involve specific differences, not only in the definition of probability itself, but also in the related concepts, properties, and procedures that have emerged to solve various problems related to each view. Probability constitutes a distinct approach to thinking and reasoning about real-life phenomena.

Probabilistic reasoning is a mode of reasoning that refers to judgments and decision-making under uncertainty and is relevant to real life, for example, when evaluating risks (Falk and Konold). It is thinking in scenarios that allow for the exploration and evaluation of different possible outcomes in situations of uncertainty. Thus, probabilistic reasoning includes the ability to: identify random events in nature, technology, and society; analyse conditions of such events and derive appropriate modelling assumptions; construct mathematical models for stochastic situations and explore various scenarios and outcomes from these models; and apply mathematical methods and procedures of probability and statistics.

A helpful metaphor in this regard is to separate the signal (the true causal difference) from the noise (the individual random variation) (Konold and Pollatsek). These authors characterise data analysis as the search for signals (causal variations) in noisy processes (which include random variation). Borovcnik introduced the structural equation, which represents data as decomposed into a signal to be recovered and noise.

The structural equation is a core idea of modelling statistical data and is a metaphor for our human response to deal with an overwhelming magnitude of relevant and irrelevant information contained in observed data. How to separate causal from random sources of variation is by no means unique. Probability hereby acquires more the character of a heuristic tool to analyse reality.

At the beginning of this century, the Standards of the National Council of Teachers of Mathematics (NCTM) in the United States included the following recommendations related to understanding and applying basic concepts of probability for children in Grades 3-5: describe events as likely or unlikely and discuss the degree of likelihood using such words as certain, equally likely, and impossible; predict the probability of outcomes of simple experiments and test the predictions; and understand that the measure of the likelihood of an event can be represented by a number from 0 to 1.

There has been a long tradition of teaching probability in middle and high school curricula where the topics taught include compound experiments and conditional probability. For example, the NCTM stated that students in Grades 6-8 should: understand and use appropriate terminology to describe complementary and mutually exclusive events; use proportionality and a basic understanding of probability to make and test conjectures about the results of experiments and simulations; and compute probabilities for simple compound events, using such methods as organised lists, tree diagrams, and area models.

In Grades 9-12 students should: understand the concepts of sample space and probability distribution and construct sample spaces and distributions in simple cases; use simulations to construct empirical probability distributions; compute and interpret the expected value of random variables in simple cases; understand the concepts of conditional probability and independent events; and understand how to compute the probability of a compound event.

A key point in teaching probability is to reflect on the main content that should be included at different educational levels. Heitele suggested a list of fundamental probabilistic concepts that played a key role in the history of probability and are the basis for the modern theory of probability. At the same time, people frequently hold incorrect intuitions about their meaning or application in the absence of instruction. This list includes the ideas of random experiment and sample space, the addition and multiplication rule, independence and conditional probability, random variables and distribution, combinations and permutations, convergence, sampling, and simulation.

Below we briefly comment on some of these ideas, which were analysed by Batanero et al.

The sentiment is expressed by Pratt: Of course, if the modelling meaning of probability was stressed in the curriculum, it is debatable whether there is much advantage in maintaining the current emphasis on coins, spinners, dice and balls drawn from a bag.

Abrahamson, D. Bridging theory: Activities designed to support the grounding of outcome-based combinatorial analysis in event-based intuitive judgment-A case study.

Pratt Eds. Monterrey, Mexico. Orchestrating semiotic leaps from tacit to cultural quantitative reasoning: the case of anticipating experimental outcomes of a quasi-binomial random generator. Cognition and Instruction, 27 3 , — CrossRef Google Scholar. The Australian curriculum: Mathematics. Sidney, NSW: Author. Randomness in textbooks: the influence of deterministic thinking. Bosch Ed. Ball, D. Content knowledge for teaching. Journal of Teacher Education, 59 5 , — Batanero, C. Teaching and learning probability. Lerman Ed. Heidelberg: Springer.

Understanding randomness: Challenges for research and teaching. Plenary lecture. Ninth European Conference of Mathematics Education. Prague, Czech Republic. Sriraman Eds. New York: Springer. Meaning and understanding of mathematics. The case of probability. Training teachers to teach probability.

Journal of Statistics Education, The nature of chance and probability. Jones Ed. The meaning of randomness for secondary school students. Journal for Research in Mathematics Education, 30 5 , — Bennett, D. Ben-Zvi, D. The challenge of developing statistical literacy, reasoning and thinking. Dordrecht, The Netherlands: Kluwer. Bernoulli, J. Original work published in Biehler, R. Computers in probability education. Borovcnik Eds. Borovcnik, M. Probabilistic and statistical thinking, In M. Borovcnick, M.

Strengthening the role of probability within statistics curricula. Batanero, G. Reading Eds. Multiple perspectives on the concept of conditional probability. Empirical research in understanding probability. A historical and philosophical perspective on probability. Sriraman, Eds. From puzzles and paradoxes to concepts in probability. Probabilistic language in Spanish textbooks. Phillips Ed. Cardano, G. The book on games of chances.

Carnap, R. Logical foundations of probability. Chicago: University of Chicago Press. Chaput, B. Frequentist approach: Modelling and simulation in statistics and probability teaching. Chernoff, E. Sample space partitions: An investigative lens. Journal of Mathematical Behavior, 28 1 , 19— Logically fallacious relative likelihood comparisons: the fallacy of composition. Experiments in Education, 40 4 , 77— Recognizing revisitation of the representativeness heuristic: an analysis of answer key attributes.

Commentary on probabilistic thinking: presenting plural perspectives. From personal to conventional probabilities: from sample set to sample space. Educational Studies in Mathematics, 77 1 , 15— Chick, H. Teaching statistics at the primary school level: beliefs, affordances, and pedagogical content knowledge. Burrill, C. Rossman Eds. Chung, K. A course in probability theory. London: Academic Press. Common Core State Standards for Mathematics. David, F. Games, gods and gambling.

London: Griffin. Rivista Italiana di Statistica, Economia e Finanza, 5 , — The doctrine of chances. New York: Chelsea Original work published in Drier, H. Fernandez Ed. Dugdale, S. Computers in the Schools, 17 1—2 , — Eichler, A. Three approaches for modelling situations with randomness. Engel, J. Zentralblatt Didaktik der Mathematik, 37 3 , — Modelling scatterplot data and the signal-noise metaphor: Towards statistical literacy for pre-service teachers.

Falk, R. The perception of randomness. Proceedings of the fifth conference of the International Group for the Psychology of Mathematics Education pp. Grenoble, France: University of Grenoble. The psychology of learning probability. Gordon Eds. Washington: Mathematical Association of America. Making sense of randomness: Implicit encoding as a basis for judgement. Psychological Review, 2 , — Fine, T.

In statistical activities, facts are collected from respondents for purposes of getting aggregate information, but confidentiality should be protected.

Mention that the agencies mandated to collect data are bound by law to protect the confidentiality of information provided by respondents. Market research organizations in the private sector and individual researchers also guard confidentiality, as they merely want to obtain aggregate data. This way, respondents can be truthful in giving information, and the researcher can commit to respondents that the data they provide will never be released to anyone in a form that will identify them without their consent.

Performing a Data Collection Activity

Explain to the students that the purpose of this data collection activity is to gather data that they could use for their future lessons in Statistics. It is important that they provide the needed information to the best of their knowledge. The following are suggested clarifications to make for each item: 1. This number excludes him or her in the count. Note that the weight has to be reported in kilograms. Note that the height has to be reported in centimeters. Note also that a zero value is not an acceptable value. Note that the student could only choose one.

Note that the time is to be reported using the military way of reporting the time, that is, the 24-hour clock. Code 1 refers to the feeling that the student is very unhappy, while Code 10 refers to the feeling that the student is very happy, on the day when the data are being collected. After the clarification, give the students at most 10 minutes to respond to the questionnaire. Ask the students to submit the completed SIS so that you can consolidate the data gathered using the formatted worksheet file provided to you as Attachment C. Be sure that the students provided the information for all items in the SIS.

Inform the students that you are to compile all their responses and compiling all these records from everyone in the class is an example of a census since data has been gathered from every student in class. Mention that the government, through the Philippine Statistics Authority PSA , conducts censuses to obtain information about socio-demographic characteristics of the residents of the country. Census data are used by the government to make plans, such as how many schools and hospitals to build. Censuses of population and housing are conducted every 10 years on years ending in zero e.

Mid-decade population censuses have also been conducted since Censuses of Agriculture, and of Philippine Business and Industry, are also conducted by the PSA to obtain information on production and other relevant economic information.

The PSA is the government agency mandated to conduct censuses and surveys. Present to the students the following collection of numbers, figures, symbols, and words, and ask them whether they would consider the collection as data. Tell the students that data are facts and figures that are collected, presented, and analyzed. Data are either numeric or non-numeric and must be contextualized.

Who provided the data? What information was obtained from the respondents, and what is the unit of measurement used for each item, if any? When was the data collected? Where was the data collected? Why was the data collected? HoW was the data collected? Let us take as an illustration the data that you have just collected from the students, and let us give it meaning, or contextualize it, by answering the W-questions.

It is recommended that the students answer the W-questions themselves so that they learn how to do it.

Once the data are contextualized, the collection of numbers and symbols has meaning. It may now look like the following, which is just a small part of the data collected in the earlier activity. Your teacher is available to respond to your queries regarding the items in this information sheet, if you have any.

Rest assured that the information that you will be providing will only be used in our lessons in Statistics and Probability. These basic terms include the universe, variable, population and sample. We will also discuss in detail other concepts related to a variable. Definition of Basic Terms in Statistics (universe, variable, population and sample) 3. Another W of the data is What: what information was obtained from the respondents?

Definition of Basic Terms

The collection of respondents from whom one obtains the data is called the universe of the study.

In our illustration, the set of students of this Statistics and Probability class is our universe. But we must caution the students that a universe is not necessarily composed of people, since there are studies where the observations are taken from plants or animals, or even from non-living things like buildings, vehicles, farms, etc. So formally, we define the universe as the collection or set of units or entities from which we obtain the data.

Thus, this set of units answers the first W of data contextualization. On the other hand, the items of information we asked from the students are referred to as the variables of the study, and in the data collection activity we have 12 variables, including Class Student Number. A variable is a characteristic that is observable or measurable in every unit of the universe. Since these characteristics are observable in each and every student of the class, they are referred to as variables. The set of all possible values of a variable is referred to as a population.

Thus, for each variable we observed, we have a population of values. The number of populations in a study is equal to the number of variables observed. In our data collection activity, there are 12 populations corresponding to the 12 variables. A subgroup of a universe or of a population is a sample. There are several ways to take a sample from a universe or a population, and the way we draw the sample dictates the kind of analysis we do with our data.

Broad Classification of Variables

Following up on the concept of a variable, inform the students that a variable usually takes on several values.

Occasionally, however, a variable can assume only one value, in which case it is called a constant. For instance, in a class of fifteen-year-olds, the age in years of the students is constant. Variables can be broadly classified as either quantitative or qualitative, with the former further classified into discrete and continuous types; see Figure 3. Qualitative variables do not strictly take on numeric values although we can have numeric codes for them, e. Data on sex or religion do not have the sense of ordering, as there is no such thing as a weaker or stronger sex, and a better or worse religion.

Qualitative variables are sometimes referred to as categorical variables. Quantitative variables have actual units of measure. Quantitative data may be further classified into: a. Discrete data are those that can be counted, e. These data assume only a finite or countably infinite number of values. Continuous data are those that can be measured, e.

The possible values are uncountably infinite. With this classification, let us then test the understanding of our students by asking them to classify the variables we had in our last data gathering activity. They should be able to classify these variables as qualitative or quantitative and, furthermore, as discrete or continuous. Special Note: For quantitative data, arithmetical operations have some physical interpretation. One can add two numbers if they have quantitative meanings, but if these numbers refer to room numbers, then adding them does not make any sense.

Even though a variable may take numerical values, this does not make the corresponding variable quantitative! The issue is whether performing arithmetical operations on these data would make any sense. It would certainly not make sense to sum two zip codes or multiply two room numbers.

The Manga Guide to Statistics. Trend-Pro Co.

A market research company requested all teachers of a particular school to fill out a questionnaire in relation to their product market study. If we are to consider the collection of information gathered through the completed questionnaires, what is the universe for this data set?

The universe is the set of all teachers in that school b. Which of the variables are qualitative? Which are quantitative? Among the quantitative variables, classify them further as discrete or continuous. Give at least two populations that could be observed from the variables identified in b. The Engineering Department of a big city did a listing of all buildings in their locality. If you are planning to gather the characteristics of these buildings, a. Set of all buildings in the big city b. It would also be better if you could classify the variables as to whether it is qualitative or quantitative.

Furthermore, classify the quantitative variable as discrete or continuous. A possible answer is the number of floors in the building, quantitative, discrete A survey of students in a certain school is conducted. The survey questionnaire details the information on the following variables.

For each of these variables, identify whether the variable is qualitative or quantitative, and if the latter, state whether it is discrete or continuous. Knowing such will enable us to plan the data collection process we need to employ in order to gather the appropriate data for analysis.

Motivational Activity 2. Levels of Measurement 3.

Then challenge the students to apply a statistical process to investigate the validity of this statement. You could enumerate on the board the steps in the process to undertake, like the following:

1. Plan or design the collection of data to verify the validity of the statement in a way that maximizes information content and minimizes bias;
2. Collect the data as required in the plan;
3. Verify the quality of the data after it was collected;
4. Summarize the information extracted from the data; and
5. Examine the summary statistics so that insight and meaningful information can be produced to support your decision whether or not to believe the given statement.

Let us discuss in detail the first step. In planning or designing the data collection activity, we could consider the set of all the students in the class as our universe.

Then let us identify the variables we need to observe or measure to verify the validity of the statement. You may ask the students to participate in the discussion by asking them to identify a question to get the needed data. The following are some suggested queries:

1. Do you usually have breakfast before going to school? (Note: This is answerable by Yes or No.)
2. What do you usually have for breakfast? (Note: Possible responses to this question are rice, bread, banana, oatmeal, cereal, etc.)

The responses to Questions 1 and 2 could lead us to identify whether a student in the class had a healthy breakfast, an unhealthy breakfast or no breakfast at all. Furthermore, there is a need to determine the performance of the student in a quiz on that day.

As we describe the data collection process to verify the validity of the statement, there is also a need to include the levels of measurement for the variables of interest.

Main Lesson: 1. Levels of Measurement

Inform students that there are four levels of measurement of variables: nominal, ordinal, interval and ratio. These are hierarchical in nature and are described as follows: The nominal level of measurement arises when we have variables that are categorical and non-numeric, or where the numbers have no sense of ordering.

As an example, consider the numbers on the uniforms of basketball players. Is the player wearing number 7 a worse player than the player wearing number 10? Maybe, or maybe not, but the number on the uniform does not have anything to do with their performance. The numbers on the uniform merely help identify the basketball player. Other examples of variables measured at the nominal level include sex, marital status, and religious affiliation.

For the study on the validity of the statement regarding the effect of breakfast on school performance, students who responded Yes to Question 1 can be coded 1, while those who responded No can be coded 0. The numbers are used simply as codes and cannot be used for ordering or for any mathematical computation. The ordinal level also deals with categorical variables, like the nominal level, but at this level ordering is important; that is, the values of the variable can be ranked. For the same study, students who had a healthy breakfast can be coded 1, those who had an unhealthy breakfast 2, and those who had no breakfast at all 3.

Using the codes, the responses can be ranked. Thus, the students who had a healthy breakfast are ranked first, while those who had no breakfast at all are ranked last in terms of having a healthy breakfast. The numerical codes here have a meaningful sense of ordering: unlike the numbers on basketball uniforms, they suggest that one student is having a healthier breakfast than another. Other examples of the ordinal scale include socioeconomic status (A to E, where A is wealthy and E is poor) and difficulty. Note to Teacher: Let us also emphasize to the students that while there is a sense of ordering, there is no zero point in an ordinal scale.

In a scale from 1 to 10, the difference between 7 and 8 may not be the same as the difference between 1 and 2. The interval level tells us that one unit differs by a certain amount of degree from another unit. Knowing how much one unit differs from another is an additional property of the interval level, on top of the properties possessed by the ordinal level.

When measuring temperature in Celsius, a 10 degree difference has the same meaning anywhere along the scale: the difference between 10 and 20 degrees Celsius is the same as that between 80 and 90 degrees Celsius. But we cannot say that 80 degrees Celsius is twice as hot as 40 degrees Celsius, since there is no true zero, only an arbitrary zero point. A measurement of 0 degrees Celsius does not reflect a true "lack of temperature." Another example of a variable measured at the interval level is the Intelligence Quotient (IQ) of a person.

We can tell not only which person ranks higher in IQ but also how much higher he or she ranks than another, but zero IQ does not mean no intelligence. The students could also be classified or categorized according to their IQ level. Hence, IQ as measured at the interval level also has the properties of variables measured at the ordinal and nominal levels.
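The Celsius example can be checked numerically. The short Python sketch below is an illustration only: it converts the two readings to the Kelvin scale, which is not part of this lesson and is brought in here only because its zero point is absolute, and then compares the ratios. The function name `celsius_to_kelvin` is our own.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert a Celsius reading to Kelvin, a scale with an absolute zero."""
    return celsius + 273.15


hot_c, warm_c = 80.0, 40.0

# Differences are meaningful at the interval level: both gaps are 10 degrees.
gap_low = 20.0 - 10.0
gap_high = 90.0 - 80.0

# The naive ratio of the Celsius readings suggests "twice as hot" ...
naive_ratio = hot_c / warm_c

# ... but on a scale with a true zero the ratio is far smaller.
kelvin_ratio = celsius_to_kelvin(hot_c) / celsius_to_kelvin(warm_c)

print(gap_low == gap_high)     # equal intervals
print(naive_ratio)             # ratio of Celsius readings
print(round(kelvin_ratio, 3))  # ratio of the same temperatures in Kelvin
```

The equal gaps show why addition and subtraction make sense on an interval scale, while the mismatch between the two ratios shows why "twice as hot" does not.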

Special Note: Inform the students also that the interval level allows addition and subtraction operations, but it does not possess an absolute zero. Zero is arbitrary, as it does not mean that the value does not exist. Zero only represents an additional measurement point. The ratio level also tells us that one unit has so many times as much of the property as does another unit. The ratio level possesses a meaningful, unique and non-arbitrary (absolute, fixed) zero point and allows all arithmetic operations.

The existence of the zero point is the only difference between the ratio and interval levels of measurement. Examples of the ratio scale include mass, heights, weights, energy and electric charge. With mass as an example, a difference of 15 grams is the same difference wherever it falls on the scale. The level at any given point is constant, and a measurement of 0 reflects a complete lack of mass.

Amount of money is also at the ratio level. We can say that one amount in pesos is twice as much as another. In addition, money has a true zero point: if you have zero money, this implies the absence of money. A score of zero implies that the student did not get a correct answer at all. In summary, we have the following levels of measurement:

| Level | Property | Basic Empirical Operation |
|---|---|---|
| Nominal | No order, distance, or origin | Determination of equivalence |
| Ordinal | Has order but no distance or unique origin | Determination of greater or lesser values |
| Interval | Both with order and distance but no unique origin | Determination of equality of intervals or difference |
| Ratio | Has order, distance and unique origin | Determination of equality of ratios or means |

The levels of measurement depend mainly on the method of measurement, not on the property measured.

The weight of primary school students measured in kilograms is at the ratio level, but the students can also be categorized as overweight, normal, or underweight, in which case weight is measured at the ordinal level. Also, many scales are only interval because their zero point is arbitrarily chosen. To assess the students' understanding of the lesson, you may go back to the set of variables in the data gathering activity done in Lesson 2.

You could ask the students to identify the level of measurement for each of the variables.

Methods of Data Collection

Variables are observed or measured using any of three methods of data collection, namely: objective, subjective and use of existing records. The objective and subjective methods obtain the data directly from the source.

The former uses any one or a combination of the five senses (sight, touch, hearing, taste and smell) to measure the variable, while the latter obtains data by getting responses through a questionnaire. The resulting data from these two methods of data collection are referred to as primary data.

The data gathered in Lesson 2 are primary data and were obtained using the subjective method. On the other hand, secondary data are obtained through the use of existing records or data collected by other entities for certain purposes. For example, when we use data gathered by the Philippine Statistics Authority, we are using secondary data and the method we employ to get the data is the use of existing records.

Other data sources include administrative records, news articles, the internet, and the like. However, we must emphasize to the students that when we use existing data we must be confident of the quality of the data we are using by knowing how the data were gathered. Also, we must remember to request permission and acknowledge the source of the data when using data gathered by other agencies or people.

Using the data on the teachers in a particular school gathered by a market research company, identify the level of measurement for each of the following variables. The following variables are included in a survey conducted among students in a certain school. Identify the level of measurement for each of the variables.

In the following, identify the data collection method used and the type of resulting data.

- The website of Philippine Airlines provides a questionnaire instrument that can be answered electronically.
- A reporter recorded the number of minutes it takes to travel from one end to the other of the Metro Manila Rail Transit (MRT) during peak and off-peak hours.
- Students measure the height of the plants using a meter stick.
- A PSA enumerator conducting the Labor Force Survey goes around the country to interview household heads on employment-related variables.

Additional concepts could help the students to appropriately describe further the data set.

Review of Lessons in Data Presentation taken up from Grade 1 to Methods of Data Presentation 3. You could assist the students to recall what they have learned in Grade 1 to 10 regarding data presentation by asking them to participate in an activity. This is actually a review and wake-up exercise. You may list on the board their responses. You could summarize their responses to be able to establish what they already know about data presentation techniques and from this you could build other concepts on the topic.

A suggestion is to classify their answers according to the three methods of data presentation.

Methods of Data Presentation

You could inform the students that, in general, there are three methods to present data. Two or all of these three methods could be used at the same time to present appropriately the information from the data set. These methods are the (1) textual or narrative, (2) tabular, and (3) graphical methods of presentation.

In presenting the data in textual (paragraph or narrative) form, one describes the data by enumerating some of the highlights of the data set, like giving the highest, lowest or average values. In case there are only a few observations, say fewer than ten, the values could be enumerated if there is a need to do so. The region with the smallest estimated poverty incidence among families at 2. Data could also be summarized or presented using tables. The tabular method of presentation is applicable for large data sets. Trends could easily be seen in this kind of presentation.

However, there is a loss of information when using this kind of presentation. The frequency distribution table is the usual tabular form of presenting the distribution of the data. The following are the common parts of a statistical table:

a. Table title: includes the number and a short description of what is found inside the table.
b. Column header: provides the label of what is being presented in a column.
c. Row header: provides the label of what is being presented in a row.
d. Body: the information in the cells at the intersection of the rows and the columns.

However, conveying too much information in a table is also not advisable.

Tables are usually used in written technical reports and in oral presentation. Table 5. This example was taken from Philippine Statistics in Brief, a regular publication of the PSA which is also the basis for the example of the textual presentation given above. Region NCR 2. Graphs are commonly used in oral presentation. There are several forms of graphs to use like the pie chart, pictograph, bar graph, line graph, histogram and box-plot. Which form to use depends on what information is to be relayed. For example, trends across time are easily seen using a line graph.

However, values of variables at the nominal or ordinal level of measurement should not be presented using a line graph. Rather, a bar graph is more appropriate. A graphical presentation in the form of a vertical bar graph of the regional estimates of poverty incidence among families is shown below:

Figure 5. Poverty Incidence Among Families in Percent

Other examples of graphical presentations shown below are lifted from the Handbook of Statistics 1, listed in the reference section at the end of this Teaching Guide.

Figure 5. Percentage distribution of dogs according to groupings identified in a dog show.

Distribution of fruit sales of a store for two days. Height and weight of STAT 1 students registered during the previous term.

The Frequency Distribution Table and Histogram

A special type of tabular and graphical presentation is the frequency distribution table (FDT) and its corresponding histogram. Specifically, these are used to depict the distribution of the data.

Most of the time, these are used in technical reports. An FDT is a presentation containing non-overlapping categories or classes of a variable and the frequencies or counts of the observations falling into those categories or classes. For a qualitative FDT, the non-overlapping categories of the variable are identified, and frequencies, as well as the percentages of observations falling into the categories, are computed. A quantitative FDT, on the other hand, is of two types: ungrouped and grouped. An ungrouped FDT is constructed when there are only a few observations or if the data set contains only a few possible values.
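As an illustration of the qualitative FDT described above, the following Python sketch counts the observations in each category and expresses each count as a percentage. The function name `qualitative_fdt` and the breakfast responses are hypothetical and were invented for this example; they are not data from the class activity.

```python
from collections import Counter


def qualitative_fdt(values):
    """Return (category, frequency, percentage) rows for a qualitative variable."""
    counts = Counter(values)
    total = sum(counts.values())
    # Categories are non-overlapping by construction: each value has one label.
    return [(cat, freq, 100.0 * freq / total) for cat, freq in sorted(counts.items())]


# Hypothetical responses, invented for illustration only.
breakfast = ["healthy", "none", "healthy", "unhealthy",
             "healthy", "none", "unhealthy", "healthy"]

for category, frequency, percent in qualitative_fdt(breakfast):
    print(f"{category:<10} {frequency:>3} {percent:6.1f}%")
```

Sorting the categories alphabetically is a presentation choice only; for an ordinal variable one would more naturally list the categories in their ranked order.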

On the other hand, a grouped FDT is constructed when there is a large number of observations and when the data set involves many possible values. The distinct values are grouped into class intervals. The creation of columns for a grouped FDT follows a set of guidelines. One such procedure is described in the following steps, which are lifted from the Workbook in Statistics 1 listed in the reference section at the end of this Teaching Guide.

Steps in the construction of a grouped FDT

1. Identify the largest data value, or the maximum (MAX), and the smallest data value, or the minimum (MIN), from the data set and compute the range, R. The range is the difference between the largest and smallest values, i.e., $R = \mathrm{MAX} - \mathrm{MIN}$.
2. Determine the number of classes, k. Round off k to the nearest whole number. It should be noted that the computed k might not be equal to the actual number of classes constructed in an FDT.
3. Calculate the class size, c. Round off c to the nearest value with the same precision as the raw data.
4. Construct the classes or the class intervals. A class interval is defined by a lower limit (LL) and an upper limit (UL). The UL of the lowest class is obtained by subtracting one unit of measure, $1/10^{x}$.
5. Tally the data into the classes constructed in Step 4 to obtain the frequency of each class. Each observation must fall in one and only one class.
6. Add, if needed, the following distributional characteristics:
   a. True Class Boundaries (TCB). The TCBs reflect the continuous property of continuous data.
   b. Class Mark (CM).
   c. Relative Frequency (RF). The RF refers to the frequency of the class as a fraction of the total frequency.

RF can be computed for both qualitative and quantitative data. RF can also be expressed in percent.

d. Cumulative Frequency (CF).

The histogram is a graphical presentation of the frequency distribution table in the form of a vertical bar graph. There are several forms of the histogram, and the most common form has the frequency on its vertical axis and the true class boundaries on the horizontal axis. As an example, the FDT and its corresponding histogram of the estimated poverty incidences of municipalities and cities of Region VIII are shown below.
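Before turning to that example, the construction of a grouped FDT can be sketched in Python. This is a minimal sketch under stated assumptions, not the workbook's exact procedure: the formulas for k and c did not survive in the text above, so the code assumes the common square-root rule (k is the square root of the number of observations) and c = R / k. It also takes x in the unit of measure to be the number of decimal places in the raw data, and the class mark to be the midpoint of the class limits. The function name `grouped_fdt` and the scores are hypothetical.

```python
import math


def grouped_fdt(data, decimals=0):
    """Build a grouped FDT as a list of dicts, one dict per class.

    decimals: ASSUMED meaning of x in the unit of measure 1/10^x,
    i.e. the number of decimal places in the raw data.
    """
    n = len(data)
    unit = 10 ** (-decimals)                  # one unit of measure
    lo, hi = min(data), max(data)
    data_range = hi - lo                      # Step 1: R = MAX - MIN
    k = max(1, round(math.sqrt(n)))           # Step 2: ASSUMED rule k = sqrt(N)
    c = max(unit, round(data_range / k, decimals))  # Step 3: ASSUMED c = R / k

    rows, lower, cumulative = [], lo, 0
    while lower <= hi:                        # Step 4: may yield k or k + 1 classes
        upper = round(lower + c - unit, decimals)
        freq = sum(1 for v in data if lower <= v <= upper)  # Step 5: tally
        cumulative += freq
        rows.append({
            "LL": lower, "UL": upper, "f": freq,
            "LTCB": lower - unit / 2, "UTCB": upper + unit / 2,  # true class boundaries
            "CM": (lower + upper) / 2,        # class mark, taken as the midpoint
            "RF": freq / n,                   # relative frequency
            "CF": cumulative,                 # cumulative frequency
        })
        lower = round(lower + c, decimals)
    return rows


# Hypothetical whole-number quiz scores, invented for illustration only.
scores = [12, 15, 17, 18, 21, 22, 22, 25, 27, 28,
          30, 31, 33, 35, 36, 38, 40, 41, 44, 47]

for row in grouped_fdt(scores):
    # The row of asterisks is a rough text stand-in for a histogram bar.
    print(f"{row['LL']:g}-{row['UL']:g}  f={row['f']}  CF={row['CF']}  " + "*" * row["f"])
```

Because the loop keeps adding classes until the maximum is covered, the number of classes actually built can differ from the computed k, which matches the caution given in Step 2.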