The Chi-Square (χ²) Test
Validating Engineering Models: Goodness-of-Fit and Independence
1. The Engineering Concept
While the Normal Distribution tells us what the data should look like, the Chi-Square Test checks if our real-world observations actually match that theory. It answers the question: "Is the difference between what I saw and what I expected just a random fluke, or is my model wrong?"
In Civil Engineering, we use it to verify if rainfall follows a specific distribution or if the failure rate of a material is truly independent of the supplier.
2. The Formula
χ² = Σ (Oᵢ - Eᵢ)² / Eᵢ
Oᵢ = Observed Frequency | Eᵢ = Expected Frequency
Degrees of Freedom (df): (Rows - 1) × (Cols - 1) for a test of independence, or (n - 1) for a goodness-of-fit test with n categories
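The formula and the two degrees-of-freedom rules can be sketched in a few lines of Python. This is a minimal illustration, not a library implementation: the function names `chi_square_statistic`, `df_goodness_of_fit` and `df_independence` are made up for this post, and the counts are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def df_goodness_of_fit(n_categories):
    """Degrees of freedom for a goodness-of-fit test: categories - 1."""
    return n_categories - 1


def df_independence(n_rows, n_cols):
    """Degrees of freedom for a contingency table: (rows - 1) x (cols - 1)."""
    return (n_rows - 1) * (n_cols - 1)


# Hypothetical counts: 60 inspections spread over 6 sites, expected 10 per site.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6
stat = chi_square_statistic(observed, expected)
df = df_goodness_of_fit(len(observed))
print(f"chi2 = {stat:.2f}, df = {df}")
```

The statistic is then compared with the critical value for that df at the chosen α, as in the worked examples.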
📊 Choosing Your Tool: Normal vs. Chi-Square
In the field, you’ll have data, but which formula do you grab? Use this table to decide:
| Feature | Normal Distribution (Z-Test) | Chi-Square Test (χ²) |
|---|---|---|
| Data Type | Continuous (Measurements like MPa, mm, km/h) | Categorical (Counts/Frequencies like Pass/Fail, Soil Type A/B) |
| Primary Goal | To find the Probability of a value falling in a given range. | To check the Relationship or "Goodness of Fit." |
| Question Answered | "What is the chance this beam fails?" | "Does the supplier affect the failure rate?" |
| Key Parameters | Mean (μ) and Std Dev (σ) | Observed (O) and Expected (E) counts |
| Example Case | Testing if a specific concrete cube hits 30 MPa. | Testing if 100 cubes follow a 1:2:1 strength ratio. |
3. Ten Worked Engineering Examples
1. Cement Bag Weights (Goodness of Fit)
A machine is set to pack bags such that 20% are 49kg, 70% are 50kg, and 10% are 51kg. A sample of 100 bags shows 15, 75, and 10 bags respectively. Test at α = 0.05.
χ² = (15-20)²/20 + (75-70)²/70 + (10-10)²/10 = 1.25 + 0.357 + 0 = 1.607.
Critical value (df=2): 5.99. Result: Match is Good (Fail to reject).
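Example 1 can be reproduced as a short script. This is a sketch using only the standard library; the p-value line relies on the fact that, for df = 2 specifically, the χ² upper-tail probability is exactly exp(-x/2), so it does not carry over to other df values.

```python
import math

proportions = [0.20, 0.70, 0.10]   # machine setting for 49, 50, 51 kg bags
observed = [15, 75, 10]            # sample of 100 bags
n = sum(observed)

# Convert the target proportions into expected counts for this sample size.
expected = [p * n for p in proportions]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Valid for df = 2 only: upper-tail probability of chi-square is exp(-x / 2).
p_value = math.exp(-chi2 / 2)

alpha = 0.05
reject = p_value < alpha
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}, reject H0: {reject}")
```

Comparing the p-value with α leads to the same decision as comparing χ² with the critical value of 5.99.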
2. Concrete Supplier vs. Strength (Independence)
Does the concrete strength (Pass/Fail) depend on the supplier (A vs B)?
Observed: A(40 Pass, 10 Fail), B(30 Pass, 20 Fail).
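Exp A-Pass = (50*70)/100 = 35; Exp A-Fail = (50*30)/100 = 15. Supplier B has the same row total (50), so its expected counts are also 35 Pass and 15 Fail.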
χ² = (40-35)²/35 + (10-15)²/15 + (30-35)²/35 + (20-15)²/15 = 4.76.
Crit (df=1): 3.84. Result: Strength depends on Supplier.
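The expected counts in an independence test come from the row and column totals. The sketch below reworks Example 2 that way; the variable names are illustrative only, and the same steps apply to any table with more rows or columns.

```python
observed = [
    [40, 10],  # Supplier A: pass, fail
    [30, 20],  # Supplier B: pass, fail
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"expected = {expected}, chi2 = {chi2:.2f}, df = {df}")
```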
3. Traffic Accidents by Day
Accidents: Mon(12), Tue(8), Wed(15), Thu(10), Fri(20). Total = 65. Is it uniform?
Expected = 65/5 = 13 per day. χ² = Σ(O-13)²/13 = (1+25+4+9+49)/13 = 6.77.
Crit (df=4): 9.49. Result: No significant evidence against a uniform distribution (fail to reject).
4. Rainfall Distribution Fit
Testing if 100 years of flood data fits a Normal Dist. Observed in 4 quartiles: 30, 22, 28, 20.
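Expected per quartile = 100/4 = 25. χ² = (5)²/25 + (-3)²/25 + (3)²/25 + (-5)²/25 = 1+0.36+0.36+1 = 2.72.
Crit (df=3): 7.82. Result: Data is consistent with the Normal Distribution (fail to reject). Note: df=3 assumes the mean and standard deviation were fixed in advance. If both are estimated from the same data, df = 4 - 1 - 2 = 1 and the critical value is 3.84; the conclusion is unchanged because 2.72 < 3.84.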
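The observed quartile counts have to come from somewhere. One way to get them, assuming the four bins are the equal-probability quartiles of the hypothesised normal distribution, is sketched below with the standard library's `statistics.NormalDist`. The helper name `quartile_counts`, the sample values, and the mean and standard deviation are all hypothetical.

```python
from statistics import NormalDist


def quartile_counts(data, mu, sigma):
    """Count observations in the four equal-probability bins of N(mu, sigma)."""
    dist = NormalDist(mu, sigma)
    # Bin edges at the 25th, 50th and 75th percentiles of the model.
    edges = [dist.inv_cdf(p) for p in (0.25, 0.50, 0.75)]
    counts = [0, 0, 0, 0]
    for x in data:
        # Bin index = number of edges the value exceeds.
        counts[sum(x > edge for edge in edges)] += 1
    return counts


# Hypothetical sample; far too small for a real test, shown only for the binning.
sample = [72, 88, 91, 95, 99, 101, 104, 108, 112, 131]
counts = quartile_counts(sample, mu=100, sigma=15)
print(counts)
```

With real data, each bin's expected count is the sample size divided by 4, and the counts feed straight into the χ² formula.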
5. Brick Type vs. Cracking
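Fly-ash vs Clay bricks. Cracks observed: Fly-ash (5 of 50), Clay (15 of 50).
Expected cracked = (50*20)/100 = 10 per type; expected uncracked = (50*80)/100 = 40 per type.
χ² = (5-10)²/10 + (45-40)²/40 + (15-10)²/10 + (35-40)²/40 = 2.5 + 0.625 + 2.5 + 0.625 = 6.25.
Crit (df=1): 3.84. Result: Brick type significantly affects cracking rate.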
6. Pavement Distress Levels
Testing if Low/Med/High distress follows a 1:2:1 ratio. Observed 100 sections: 30 Low, 45 Med, 25 High.
Expected: 25 Low, 50 Med, 25 High. χ² = (5)²/25 + (-5)²/50 + (0)²/25 = 1 + 0.5 + 0 = 1.5.
Crit (df=2): 5.99. Result: Pavement follows the expected ratio.
7. Soil Type vs. Foundation Settlement
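Does Settlement (High/Low) depend on Soil (Sandy/Clay)? Sandy: 10 High, 40 Low. Clay: 25 High, 25 Low.
Expected High = (50*35)/100 = 17.5 per soil; expected Low = (50*65)/100 = 32.5 per soil.
χ² = (10-17.5)²/17.5 + (40-32.5)²/32.5 + (25-17.5)²/17.5 + (25-32.5)²/32.5 = 3.21 + 1.73 + 3.21 + 1.73 = 9.89.
Crit (df=1): 3.84. Result: Settlement is highly dependent on soil type.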
8. Equipment Breakdown Frequency
3 Cranes. Breakdowns: C1(5), C2(12), C3(7). Is one crane worse?
χ² = (5-8)²/8 + (12-8)²/8 + (7-8)²/8 = 1.125 + 2.0 + 0.125 = 3.25.
Crit (df=2): 5.99. Result: No significant difference between cranes.
9. Surveying Error Source
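Errors attributed to: Human(40), Instrument(30), Environment(30). Expected: 33.3 each (one third of 100).
χ² = (40-33.3)²/33.3 + (30-33.3)²/33.3 + (30-33.3)²/33.3 = 1.33 + 0.33 + 0.33 = 2.0.
Crit (df=2): 5.99. Result: No significant evidence that the error sources differ in frequency (fail to reject).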
10. Steel Grade vs Corrosion
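Grade 1: 5 corroded/50. Grade 2: 12 corroded/50. Test for difference.
Expected corroded = (50*17)/100 = 8.5 per grade; expected not corroded = (50*83)/100 = 41.5 per grade.
χ² = (5-8.5)²/8.5 + (45-41.5)²/41.5 + (12-8.5)²/8.5 + (38-41.5)²/41.5 = 1.44 + 0.30 + 1.44 + 0.30 = 3.47.
Crit (df=1): 3.84. Result: At 5% level, no significant difference, but only just: 3.47 is close to the critical value.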
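Example 10 is the closest call of the ten, so a p-value says more than a bare reject / fail-to-reject. The sketch below recomputes it with the standard library only. The last line uses the identity that, for df = 1 specifically, the χ² upper-tail probability equals erfc(√(x/2)); it is not valid for other df values.

```python
import math

observed = [
    [5, 45],   # Grade 1: corroded, not corroded
    [12, 38],  # Grade 2: corroded, not corroded
]
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected count under independence
        chi2 += (o - e) ** 2 / e

# Valid for df = 1 only: upper-tail probability of chi-square is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))
print(f"chi2 = {chi2:.2f}, p = {p_value:.3f}")
```

A p-value only slightly above 0.05 is a hint to collect more samples before treating the two grades as equivalent.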
⚠️ Common Pitfalls
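- Small Sample Size: The χ² approximation is unreliable when expected counts are small. A common rule of thumb is to avoid Chi-Square if any "Expected" value is less than 5. For a 2x2 table, use Fisher’s Exact Test instead; for a goodness-of-fit test, merge sparse categories.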
- Using Percentages: Always use raw counts (frequencies). χ² doesn't work on percentages or means.
- Degrees of Freedom: Students often use n instead of n-1. In a 2x2 table, df is always 1!
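The first two pitfalls are easy to guard against in code. The sketch below is illustrative only: `chi_square` and its `min_expected` parameter are hypothetical names, and the 1000-bag sample is invented to show what happens when the same percentages come from a larger sample.

```python
def chi_square(observed, expected, min_expected=5):
    """Chi-square statistic that refuses expected counts below the rule of thumb."""
    if any(e < min_expected for e in expected):
        raise ValueError("an expected count is below the rule-of-thumb minimum")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Same percentages (15/75/10 observed vs 20/70/10 expected), different sample sizes.
small = chi_square([15, 75, 10], [20, 70, 10])        # 100 bags, as in Example 1
large = chi_square([150, 750, 100], [200, 700, 100])  # hypothetical 1000 bags
print(f"n=100: {small:.2f}, n=1000: {large:.2f}")
```

The statistic scales with the sample size, so identical percentages can lead to opposite conclusions. That is why the test needs raw counts.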