Discovering the Normal Distribution with New DIKWP
Yucong Duan
International Standardization Committee of Networked DIKWP for Artificial Intelligence Evaluation (DIKWP-SC)
World Artificial Consciousness CIC (WAC)
World Conference on Artificial Consciousness (WCAC)
(Email: duanyucong@hotmail.com)
Introduction
From the earliest days of my life, I was surrounded by a world rich in patterns and variations. I observed the heights of plants in a garden, the sizes of pebbles on a path, and the durations of bird songs. Although individual instances varied, there seemed to be an underlying order to these variations. Driven by curiosity, I sought to understand how these natural fluctuations could be quantified and predicted.
In this narrative, I will detail how, starting from basic experiences as an infant, I independently observed, experimented, and logically deduced the concept of the Normal Distribution. Using the DIKWP Semantic Mathematics framework—which stands for Data, Information, Knowledge, Wisdom, and Philosophy (instead of Purpose)—I evolved each concept explicitly from my experiences, ensuring that my understanding is grounded in reality and free from subjective definitions.
Chapter 1: Gathering Data from the Environment
1.1 Observing Variations in Nature
Collecting Data
Activity: I began by collecting pebbles from a garden path.
Observation: The pebbles varied in size, color, and shape.
Data: I recorded each pebble's size on a simple three-level scale (small, medium, large).
Observation: Most pebbles were of medium size, with fewer small and large ones.
Reflection: There is a tendency for certain sizes to be more common than others.
Method: I sorted pebbles into size categories and counted them.
Data Table:
| Size | Count |
|---|---|
| Small | 10 |
| Medium | 50 |
| Large | 10 |
Tool: Constructed a simple bar chart to represent the counts.
Observation: The chart has a peak at the medium size, forming a symmetric pattern.
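This counting-and-charting step can be reproduced with a minimal Python sketch (the label list below is illustrative, and matplotlib is an assumed plotting dependency):

```python
from collections import Counter
import matplotlib.pyplot as plt

# Illustrative labels matching the observed counts: 10 small, 50 medium, 10 large.
pebbles = ["small"] * 10 + ["medium"] * 50 + ["large"] * 10

# Count how many pebbles fall into each size category.
counts = Counter(pebbles)
sizes = ["small", "medium", "large"]

# Bar chart: size categories on the x-axis, counts on the y-axis.
plt.bar(sizes, [counts[s] for s in sizes])
plt.xlabel("Pebble size")
plt.ylabel("Count")
plt.title("Pebble size counts")
plt.show()
```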
Concept: Frequency refers to how often a particular value occurs.
Calculation: Calculated the relative frequency of each size by dividing its count by the total number of pebbles (70).
Table:
| Size | Count | Relative Frequency |
|---|---|---|
| Small | 10 | 0.14 |
| Medium | 50 | 0.71 |
| Large | 10 | 0.14 |
Observation: Medium-sized pebbles are the most frequent.
Semantics: The data provides information about the distribution of pebble sizes.
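The relative frequencies follow from a one-line division (a sketch over the same counts; note that 10/70 ≈ 0.14 and 50/70 ≈ 0.71):

```python
counts = {"small": 10, "medium": 50, "large": 10}
total = sum(counts.values())  # 70 pebbles in all

# Relative frequency = count / total; the three values sum to 1.
for size, count in counts.items():
    print(f"{size}: {count / total:.2f}")
# small: 0.14, medium: 0.71, large: 0.14
```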
Tool: Created a histogram with pebble sizes on the x-axis and frequency on the y-axis.
Observation: The histogram resembles a bell-shaped curve.
Observation: The distribution is symmetric around the medium size.
Reflection: This pattern may be common in other natural phenomena.
Data Collection: Measured the heights of plants in a garden.
Observation: Heights varied, with most plants around a certain average height.
Observation: The histogram of plant heights also formed a bell-shaped curve.
Reflection: Different datasets exhibit similar distribution patterns.
Symmetry: Both datasets are symmetric around the mean.
Peak Frequency: Highest frequency occurs at the mean value.
Tails: Frequencies decrease as values move away from the mean.
Hypothesis: Many natural phenomena follow a bell-shaped distribution pattern.
Concept: The mean is the average value of a dataset.
Calculation: Sum of all values divided by the number of observations.
Assigning Numerical Values:
Small = 1, Medium = 2, Large = 3
Calculation:
$$\text{Mean} = \frac{(1 \times 10) + (2 \times 50) + (3 \times 10)}{70} = \frac{10 + 100 + 30}{70} = \frac{140}{70} = 2$$
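The same weighted mean in code (a sketch assuming numpy is available):

```python
import numpy as np

values = np.array([1, 2, 3])     # numeric codes: small, medium, large
counts = np.array([10, 50, 10])  # observed frequencies

# Weighted mean: sum(value * count) / sum(count) = 140 / 70.
mean = np.average(values, weights=counts)
print(mean)  # 2.0
```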
Concept: The standard deviation measures the spread of data around the mean.
Calculation: Square root of the average of the squared differences from the mean.
Find the Deviations:
$$\text{Value} - \text{Mean}$$
Square the Deviations:
$$(\text{Value} - \text{Mean})^2$$
Calculate the Variance:
$$\text{Variance} = \frac{\sum (\text{Value} - \text{Mean})^2}{N}$$
Standard Deviation:
$$\sigma = \sqrt{\text{Variance}}$$
Calculations:
Small: \( (1 - 2)^2 = 1 \)
Medium: \( (2 - 2)^2 = 0 \)
Large: \( (3 - 2)^2 = 1 \)
Weighted Sum:
$$\text{Variance} = \frac{(1 \times 10) + (0 \times 50) + (1 \times 10)}{70} = \frac{10 + 0 + 10}{70} = \frac{20}{70} \approx 0.2857$$
Standard Deviation:
$$\sigma = \sqrt{0.2857} \approx 0.535$$
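A quick numerical check of the variance and standard deviation, reusing the weighted category values:

```python
import numpy as np

values = np.array([1, 2, 3])
counts = np.array([10, 50, 10])

mean = np.average(values, weights=counts)                     # 2.0
variance = np.average((values - mean) ** 2, weights=counts)   # 20/70 ≈ 0.2857
sigma = np.sqrt(variance)                                     # ≈ 0.535

print(f"variance ≈ {variance:.4f}, sigma ≈ {sigma:.3f}")
```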
Probability Density Function (PDF):
$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
\( \mu \): mean of the distribution
\( \sigma \): standard deviation
\( x \): the variable
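The formula translates directly into a small function (a sketch; scipy.stats.norm.pdf computes the same quantity):

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    """Normal probability density at x for mean mu and standard deviation sigma."""
    coeff = 1.0 / (sigma * np.sqrt(2.0 * np.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * np.exp(exponent)

# The density peaks at the mean: for mu = 2, sigma ≈ 0.535 it is about 0.75.
print(normal_pdf(2.0, mu=2.0, sigma=0.535))
```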
Bell-Shaped Curve
Symmetry around the Mean
68-95-99.7 Rule:
Approximately 68% of the data falls within \( \pm 1\sigma \) of the mean
Approximately 95% within \( \pm 2\sigma \)
Approximately 99.7% within \( \pm 3\sigma \)
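These percentages can be verified from the cumulative distribution function (a sketch assuming scipy is available):

```python
from scipy.stats import norm

# Probability mass within k standard deviations of the mean of any normal.
for k in (1, 2, 3):
    p = norm.cdf(k) - norm.cdf(-k)
    print(f"within ±{k}σ: {p:.4f}")
# within ±1σ: 0.6827, within ±2σ: 0.9545, within ±3σ: 0.9973
```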
Using the Mean (\( \mu = 2 \)) and Standard Deviation (\( \sigma \approx 0.535 \))
Plotting the PDF
Observation: The theoretical curve aligns closely with the histogram.
Calculating Expected Frequencies
Comparing with Observed Frequencies
Conclusion: The normal distribution models the data effectively.
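One way to make this comparison concrete is to treat each size category as an interval on the numeric scale — the bin edges 1.5 and 2.5 below are an assumed binning, not something fixed by the data — and scale the fitted probabilities by the total count:

```python
from scipy.stats import norm

mu, sigma, total = 2.0, 0.535, 70
observed = {"small": 10, "medium": 50, "large": 10}

# Assumed bin edges mapping the 1/2/3 category codes onto intervals.
bins = {
    "small":  (float("-inf"), 1.5),
    "medium": (1.5, 2.5),
    "large":  (2.5, float("inf")),
}

# Expected count = total * probability the fitted normal assigns to the bin.
for size, (lo, hi) in bins.items():
    p = norm.cdf(hi, mu, sigma) - norm.cdf(lo, mu, sigma)
    print(f"{size}: expected ≈ {total * p:.1f}, observed = {observed[size]}")
# expected ≈ 12.3 / 45.5 / 12.3 versus observed 10 / 50 / 10 — a rough but reasonable fit
```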
Examples:
Heights of individuals in a population
Measurement errors in experiments
Test scores in large groups
The normal distribution appears in various contexts due to the Central Limit Theorem.
Concept: The sum of a large number of independent, identically distributed random variables tends toward a normal distribution, regardless of the original distribution.
Explanation for Universality: Aggregated random processes tend toward a normal distribution, whatever the shape of the individual components.
Semantics: The normal distribution is a natural outcome of combining random variables.
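A small simulation makes this visible: sum many independent uniform draws — each one flat, nothing like a bell — and the histogram of the sums turns bell-shaped (a sketch assuming numpy and matplotlib):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# Each of the 10,000 samples is the sum of 30 independent Uniform(0, 1) draws.
sums = rng.uniform(0.0, 1.0, size=(10_000, 30)).sum(axis=1)

# The individual variables are flat, yet their sums pile up in a bell shape.
plt.hist(sums, bins=50, density=True)
plt.xlabel("Sum of 30 uniform draws")
plt.ylabel("Density")
plt.title("Sums of uniforms approach a normal distribution")
plt.show()
```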
Observation: While individual events are random, overall patterns exhibit order.
Reflection: There is a philosophical harmony between randomness at the micro level and determinism at the macro level.
Observation: Not all datasets perfectly fit the normal distribution.
Considerations:
Skewed Distributions: When data is not symmetric.
Kurtosis: When data has heavier or lighter tails than the normal distribution.
Understanding Complexity: Real-world phenomena may require more nuanced models.
Embracing Uncertainty: Recognizing the limitations enhances the pursuit of knowledge.
Hypothesis: Any process that aggregates a large number of independent random variables will result in a normal distribution under certain conditions.
Data: Collect measurements from aggregated processes.
Information: Analyze frequency distributions.
Knowledge: Recognize patterns aligning with the normal distribution.
Wisdom: Generalize the observation to formulate the theorem.
Philosophy: Reflect on the implications and limitations.
Collecting Data: Gather datasets from different aggregated processes (e.g., rolling dice, combining measurement errors).
Analysis: Fit the data to the normal distribution.
Validation: Confirm that the theorem holds under specified conditions.
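A sketch of one such validation with dice: simulate sums of ten dice, then check the coverage the normal model predicts (one die has mean 3.5 and variance 35/12, so the sum has mean 35 and standard deviation ≈ 5.4):

```python
import numpy as np

rng = np.random.default_rng(1)

# 10,000 experiments, each summing ten six-sided dice.
sums = rng.integers(1, 7, size=(10_000, 10)).sum(axis=1)

mu = 10 * 3.5                   # theoretical mean of the sum: 35
sigma = np.sqrt(10 * 35 / 12)   # theoretical standard deviation: ≈ 5.40

# Under a normal model, about 68% and 95% of sums should land within 1σ and 2σ.
for k in (1, 2):
    frac = np.mean(np.abs(sums - mu) <= k * sigma)
    print(f"within ±{k}σ: {frac:.3f}")
```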
Concept: Standard deviation quantifies the dispersion of data points.
Implication: A larger \( \sigma \) indicates more spread-out data.
Comparing Datasets: Understanding which dataset has more variability.
Quality Control: Monitoring consistency in processes.
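For instance, two production lines can share a mean yet differ sharply in consistency; the numbers below are hypothetical:

```python
import numpy as np

# Hypothetical part lengths (mm) from two production lines with equal means.
line_a = np.array([9.9, 10.0, 10.1, 10.0, 9.9, 10.1])
line_b = np.array([9.0, 11.0, 10.5, 9.5, 10.8, 9.2])

# Same mean, very different spread: line B is far less consistent.
print(np.mean(line_a), np.std(line_a, ddof=1))  # 10.0, sigma ≈ 0.09
print(np.mean(line_b), np.std(line_b, ddof=1))  # 10.0, sigma ≈ 0.87
```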
Law of Large Numbers: As sample size increases, the sample mean approaches the population mean.
Proof Sketch:
Starting Point: Individual random variables with finite mean and variance.
Summation: Sum of these variables.
Normalization: Rescaling the sum so its variance stays fixed as the number of terms grows; the rescaled sum converges to a normal distribution (see the formulation below).
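Both limit results can be written compactly. Assuming independent, identically distributed \( X_1, X_2, \ldots \) with mean \( \mu \) and finite variance \( \sigma^2 \):

$$\bar{X}_n = \frac{1}{n}\sum_{i=1}^{n} X_i \longrightarrow \mu \quad \text{(Law of Large Numbers)}$$
$$Z_n = \frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \longrightarrow N(0,\,1) \quad \text{(Central Limit Theorem)}$$

Dividing by \( \sigma\sqrt{n} \) is the normalization step: it holds the variance of \( Z_n \) at exactly 1 for every \( n \), so a stable limiting shape can emerge.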
Foundation for Inferential Statistics: Enables confidence intervals and hypothesis testing.
Usage: Modeling errors in instruments to improve accuracy.
Example: Calibrating devices based on expected normal error distribution.
Application: Filtering out noise modeled as normally distributed.
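A sketch of the averaging idea behind calibration: if each reading is the true value plus normal noise, the mean of \( n \) readings has noise \( \sigma/\sqrt{n} \) (the true value and noise level below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(2)

true_value = 10.0   # hypothetical quantity being measured
noise_sigma = 0.5   # hypothetical instrument noise, assumed normal

# 100 noisy readings of the same quantity.
readings = true_value + rng.normal(0.0, noise_sigma, size=100)

# Averaging shrinks the noise: the mean's standard error is 0.5/sqrt(100) = 0.05.
print(f"single-reading noise: {noise_sigma}")
print(f"estimate from 100 readings: {readings.mean():.3f}")
```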
Observation: Test scores often approximate a normal distribution.
Usage: Setting grading curves and percentiles.
Application: Assessing traits like IQ, which are modeled using normal distributions.
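A sketch of percentile scoring under a normal model; the population mean of 70 and standard deviation of 10 are illustrative assumptions:

```python
from scipy.stats import norm

mean, std = 70.0, 10.0   # assumed population parameters for the test
score = 85.0             # a hypothetical student's score

# Percentile = probability that a randomly chosen score falls below this one.
percentile = 100 * norm.cdf(score, mean, std)
print(f"score {score:.0f} → percentile ≈ {percentile:.1f}")  # ≈ 93.3
```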
Through observation, experimentation, and logical reasoning, I was able to discover and understand the Normal Distribution. Starting from basic experiences with natural variations, I collected data, transformed it into information, developed knowledge, gained wisdom, and reflected philosophically—embodying the DIKWP Semantic Mathematics framework. This journey demonstrates how complex mathematical concepts can emerge naturally from simple observations, without relying on subjective definitions.
The normal distribution is a cornerstone of statistics and probability, providing insights into the patterns underlying random variables. By recognizing its prevalence in natural phenomena, we gain powerful tools for prediction, analysis, and decision-making.
Epilogue: Implications for Learning and AI
This narrative illustrates how foundational mathematical principles can be understood through direct interaction with the environment and logical reasoning. In the context of artificial intelligence and cognitive development, it emphasizes the importance of experiential learning and the structured evolution of concepts from data to philosophy.
By enabling AI systems to:
Collect Data: Observe and gather information from the environment.
Transform Data into Information: Analyze and find patterns.
Develop Knowledge: Formulate generalizations and models.
Gain Wisdom: Understand the implications and applications.
Reflect Philosophically: Consider the broader impact and limitations.
we can foster the development of intuitive understanding similar to human learning. This approach promotes the natural discovery of mathematical relationships without reliance on predefined definitions.
Note: This detailed narrative presents the conceptualization of the Normal Distribution as if I, an infant, independently observed and reasoned it out. Each chapter is explored in full length, emphasizing the natural progression from data collection to philosophical reflection using the DIKWP Semantic Mathematics framework. This approach demonstrates that with curiosity and logical thinking, foundational knowledge about complex mathematical concepts can be accessed and understood without relying on subjective definitions.
References for Further Reading
International Standardization Committee of Networked DIKWP for Artificial Intelligence Evaluation (DIKWP-SC), World Association of Artificial Consciousness (WAC), World Conference on Artificial Consciousness (WCAC). Standardization of DIKWP Semantic Mathematics of International Test and Evaluation Standards for Artificial Intelligence based on Networked Data-Information-Knowledge-Wisdom-Purpose (DIKWP) Model. October 2024. DOI: 10.13140/RG.2.2.26233.89445. https://www.researchgate.net/publication/384637381_Standardization_of_DIKWP_Semantic_Mathematics_of_International_Test_and_Evaluation_Standards_for_Artificial_Intelligence_based_on_Networked_Data-Information-Knowledge-Wisdom-Purpose_DIKWP_Model
Duan, Y. (2023). The Paradox of Mathematics in AI Semantics. Prof. Yucong Duan proposed the Paradox of Mathematics: current mathematics, because it is built on abstracting away from real semantics while aiming to capture the reality of semantics, will not by itself reach the goal of supporting real AI development.