Friday, March 6, 2026

#2 Chi-Square

Chi-Square Test in Civil Engineering

The Chi-Square (χ²) Test

Validating Engineering Models: Goodness-of-Fit and Independence

1. The Engineering Concept

While the Normal Distribution tells us what the data should look like, the Chi-Square Test checks if our real-world observations actually match that theory. It answers the question: "Is the difference between what I saw and what I expected just a random fluke, or is my model wrong?"

In Civil Engineering, we use it to verify if rainfall follows a specific distribution or if the failure rate of a material is truly independent of the supplier.

2. The Formula

χ² = Σ [ (Oᵢ - Eᵢ)² / Eᵢ ]

Oᵢ = Observed Frequency | Eᵢ = Expected Frequency
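Degrees of Freedom (df): (Rows - 1) × (Cols - 1) for an independence test on a contingency table, or (n - 1) for a goodness-of-fit test, where n is the number of categories (not the sample size).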
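The formula and the two df rules translate almost line for line into code. The sketch below is only an illustration: the function names are made up for this post, and the optional `estimated_params` argument reflects the standard adjustment (one less degree of freedom per parameter estimated from the data), which the formula line above does not spell out. The counts come from the crane breakdown case in Example 8.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def df_goodness_of_fit(num_categories, estimated_params=0):
    """Categories minus 1, minus one more per parameter estimated from the data."""
    return num_categories - 1 - estimated_params


def df_independence(num_rows, num_cols):
    """(Rows - 1) x (Cols - 1) for a contingency table."""
    return (num_rows - 1) * (num_cols - 1)


# Crane breakdown counts from Example 8: 24 breakdowns, 8 expected per crane.
crane_observed = [5, 12, 7]
crane_expected = [8, 8, 8]
crane_stat = chi_square_statistic(crane_observed, crane_expected)
print(round(crane_stat, 3))                     # 3.25
print(df_goodness_of_fit(len(crane_observed)))  # 2
print(df_independence(2, 2))                    # 1
```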

📊 Choosing Your Tool: Normal vs. Chi-Square

In the field, you’ll have data, but which formula do you grab? Use this table to decide:

| Feature | Normal Distribution (Z-Test) | Chi-Square Test (χ²) |
|---|---|---|
| Data Type | Continuous (measurements like MPa, mm, km/h) | Categorical (counts/frequencies like Pass/Fail, Soil Type A/B) |
| Primary Goal | To find the probability that a measurement falls in a given range. | To check a relationship or "Goodness of Fit." |
| Question Answered | "What is the chance this beam fails?" | "Does the supplier affect the failure rate?" |
| Key Parameters | Mean (μ) and Std Dev (σ) | Observed (O) and Expected (E) counts |
| Example Case | Estimating the chance that a concrete cube reaches 30 MPa. | Testing if 100 cubes follow a 1:2:1 strength ratio. |

💡 Tutor Tip: If you are measuring "How much," use Normal. If you are counting "How many," use Chi-Square.

3. 10 Worked Engineering Examples

1. Cement Bag Weights (Goodness of Fit)

A machine is set to pack bags such that 20% are 49kg, 70% are 50kg, and 10% are 51kg. A sample of 100 bags shows 15, 75, and 10 bags respectively. Test at α = 0.05.

Expected: 20, 70, 10.
χ² = (15-20)²/20 + (75-70)²/70 + (10-10)²/10 = 1.25 + 0.357 + 0 = 1.607.
Critical value (df=2): 5.99. Result: 1.607 < 5.99, so we fail to reject H₀. The sample is consistent with the machine's target mix.
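For readers who prefer to check the arithmetic in code, here is a minimal sketch of the same goodness-of-fit test. The helper names are hypothetical, and the critical value 5.99 is simply passed in from the chi-square table rather than computed.

```python
def expected_from_percentages(percentages, sample_size):
    """Convert target percentages into expected counts for a given sample size."""
    if sum(percentages) != 100:
        raise ValueError("percentages must add up to 100")
    return [sample_size * p / 100 for p in percentages]


def goodness_of_fit(observed, percentages, critical_value):
    """Return (statistic, reject_null) for a chi-square goodness-of-fit test."""
    expected = expected_from_percentages(percentages, sum(observed))
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return statistic, statistic > critical_value


# Example 1: 49 kg, 50 kg and 51 kg bags; critical value 5.99 (df = 2, alpha = 0.05).
bag_stat, bag_reject = goodness_of_fit([15, 75, 10], [20, 70, 10], 5.99)
print(round(bag_stat, 3), bag_reject)  # 1.607 False
```

Note that the test works on the raw counts; the percentages are only used to build the expected counts.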

2. Concrete Supplier vs. Strength (Independence)

Does the concrete strength (Pass/Fail) depend on the supplier (A vs B)?
Observed: A(40 Pass, 10 Fail), B(30 Pass, 20 Fail).

Total Pass = 70, Total Fail = 30. Total N = 100.
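Exp A-Pass = (50*70)/100 = 35; Exp A-Fail = (50*30)/100 = 15. Supplier B has the same row total (50), so its expected counts are also 35 and 15.
χ² = (40-35)²/35 + (10-15)²/15 + (30-35)²/35 + (20-15)²/15 = 0.714 + 1.667 + 0.714 + 1.667 = 4.76.
Crit (df=1): 3.84. Result: 4.76 > 3.84, so we reject independence. The pass rate appears to depend on the supplier.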
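The expected counts for an independence test always come from the row and column totals, which makes them easy to automate. The following is a rough sketch for a table of any size; `expected_counts` and `chi_square_independence` are illustrative names, not functions from a library.

```python
def expected_counts(table):
    """Expected count per cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (statistic, df) for an r x c table of observed counts."""
    expected = expected_counts(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Example 2: rows are suppliers A and B, columns are Pass and Fail.
supplier_table = [[40, 10], [30, 20]]
print(expected_counts(supplier_table))  # [[35.0, 15.0], [35.0, 15.0]]
supplier_stat, supplier_df = chi_square_independence(supplier_table)
print(round(supplier_stat, 2), supplier_df)  # 4.76 1
```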

3. Traffic Accidents by Day

Accidents: Mon(12), Tue(8), Wed(15), Thu(10), Fri(20). Total = 65. Is it uniform?

Exp = 65/5 = 13 per day.
χ² = Σ(O-13)²/13 = (1+25+4+9+49)/13 = 6.77.
Crit (df=4): 9.49. Result: 6.77 < 9.49, so we fail to reject H₀. There is no significant evidence that accidents are unevenly spread across the weekdays.

4. Rainfall Distribution Fit

Testing if 100 years of flood data fits a Normal Dist. Observed in 4 quartiles: 30, 22, 28, 20.

Exp = 100/4 = 25 per quartile.
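χ² = 5²/25 + (-3)²/25 + 3²/25 + (-5)²/25 = 1 + 0.36 + 0.36 + 1 = 2.72.
Crit (df=3): 7.82, assuming the Normal parameters were fixed in advance. If μ and σ are estimated from the same data, df drops to 4 - 1 - 2 = 1 and the critical value becomes 3.84. Either way 2.72 is below the threshold. Result: No significant evidence against the Normal Distribution.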

5. Brick Type vs. Cracking

Fly-ash vs Clay bricks. Cracks observed: Fly-ash(5/50), Clay(15/50).

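Observed (Cracked, Sound): Fly-ash (5, 45), Clay (15, 35). Expected: (10, 40) for each brick type.
χ² = (5-10)²/10 + (45-40)²/40 + (15-10)²/10 + (35-40)²/40 = 2.5 + 0.625 + 2.5 + 0.625 = 6.25. Crit (df=1): 3.84.
Result: 6.25 > 3.84, so brick type significantly affects cracking rate.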
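For any 2x2 table there is a well-known shortcut that skips the expected counts: χ² = N(ad - bc)² / (product of the two row totals and the two column totals). Here is a small sketch of it applied to the brick counts; the function name `chi_square_2x2` is just a label chosen for this example.

```python
def chi_square_2x2(a, b, c, d):
    """Chi-square for the table [[a, b], [c, d]] using N(ad - bc)^2 / (row and column totals)."""
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denominator


# Example 5: fly-ash 5 cracked / 45 sound, clay 15 cracked / 35 sound.
brick_stat = chi_square_2x2(5, 45, 15, 35)
print(brick_stat)         # 6.25
print(brick_stat > 3.84)  # True
```

The shortcut gives the same number as summing (O - E)²/E over the four cells, so it is a handy cross-check on hand calculations.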

6. Pavement Distress Levels

Testing if Low/Med/High distress follows a 1:2:1 ratio. Observed 100 sections: 30 Low, 45 Med, 25 High.

Exp: 25, 50, 25.
χ² = 5²/25 + (-5)²/50 + 0²/25 = 1 + 0.5 + 0 = 1.5.
Crit (df=2): 5.99. Result: 1.5 < 5.99, so the observed distress levels are consistent with the expected 1:2:1 ratio.

7. Soil Type vs. Foundation Settlement

Does Settlement (High/Low) depend on Soil (Sandy/Clay)? Sandy: 10 High, 40 Low. Clay: 25 High, 25 Low.

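Totals: High = 35, Low = 65, N = 100. Expected: 17.5 High and 32.5 Low for each soil type.
χ² = 2 × (7.5²/17.5) + 2 × (7.5²/32.5) = 6.43 + 3.46 = 9.89. Crit (df=1): 3.84.
Result: 9.89 > 3.84, so settlement depends significantly on soil type.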

8. Equipment Breakdown Frequency

3 Cranes. Breakdowns: C1(5), C2(12), C3(7). Is one crane worse?

Total = 24. Exp = 8.
χ² = (5-8)²/8 + (12-8)²/8 + (7-8)²/8 = 1.125 + 2.0 + 0.125 = 3.25.
Crit (df=2): 5.99. Result: No significant difference between cranes.

9. Surveying Error Source

Errors attributed to: Human(40), Instrument(30), Environment(30). Expected: 33.3% each.

Exp = 33.3. χ² = 44.4/33.3 + 11.1/33.3 + 11.1/33.3 = 1.33 + 0.33 + 0.33 = 2.0.
Crit (df=2): 5.99. Result: 2.0 < 5.99, so there is no significant evidence that the error sources differ in frequency.
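Because the expected count here is 100/3, rounding it to 33.3 introduces small errors in every term. One way to avoid that, sketched below, is to carry exact fractions through the calculation with Python's standard `fractions` module. The variable names are illustrative.

```python
from fractions import Fraction

# Example 9 counts: Human, Instrument, Environment.
survey_observed = [40, 30, 30]
survey_total = sum(survey_observed)

# Exactly 100/3 per source instead of the rounded 33.3.
survey_expected = Fraction(survey_total, len(survey_observed))

survey_stat = sum((o - survey_expected) ** 2 / survey_expected for o in survey_observed)
print(survey_stat)         # 2
print(float(survey_stat))  # 2.0
```

The exact statistic is 2, which confirms that the rounded hand calculation lands on the right value.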

10. Steel Grade vs Corrosion

Grade 1: 5 corroded/50. Grade 2: 12 corroded/50. Test for difference.

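Observed (Corroded, Sound): Grade 1 (5, 45), Grade 2 (12, 38). Expected: (8.5, 41.5) for each grade.
χ² = 2 × (3.5²/8.5) + 2 × (3.5²/41.5) = 2.88 + 0.59 = 3.47. Crit (df=1): 3.84.
Result: 3.47 < 3.84, so at the 5% level there is no significant difference, although the statistic is close to the threshold.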
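A borderline case like this is a good moment to see where the table values 3.84 and 5.99 come from. The sketch below relies on two standard facts: with df = 1 the chi-square variable is the square of a standard normal variable, and with df = 2 its upper-tail probability is exp(-x/2). It uses only the Python standard library; the helper `p_value_df1` is a hypothetical name and applies to df = 1 only.

```python
import math
from statistics import NormalDist

# df = 1: the 5% critical value is the square of the two-sided z value (about 1.96).
z_two_sided = NormalDist().inv_cdf(1 - 0.05 / 2)
crit_df1 = z_two_sided ** 2

# df = 2: the upper-tail probability is exp(-x / 2), so x = -2 ln(alpha).
crit_df2 = -2 * math.log(0.05)

print(round(crit_df1, 2), round(crit_df2, 2))  # 3.84 5.99


def p_value_df1(statistic):
    """Upper-tail probability of a chi-square statistic with 1 degree of freedom."""
    return 2 * (1 - NormalDist().cdf(math.sqrt(statistic)))


# Example 10 counts: 5 of 50 vs 12 of 50 corroded (2x2 shortcut formula).
steel_stat = 100 * (5 * 38 - 45 * 12) ** 2 / (50 * 50 * 17 * 83)
steel_p = p_value_df1(steel_stat)
print(round(steel_stat, 2))  # 3.47
print(steel_p > 0.05)        # True
```

The p-value sits only slightly above 0.05, which is the numerical version of "close to the threshold."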

⚠️ Common Pitfalls

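  • Small Sample Size: Avoid Chi-Square if any "Expected" value is less than 5, because the approximation becomes unreliable. For a 2x2 table, use Fisher’s Exact Test instead; for a goodness-of-fit test, merge neighbouring categories until every expected count is at least 5.
  • Using Percentages: Always use raw counts (frequencies). The size of χ² depends on the sample size, so the same percentages from 100 and from 1000 observations give different results. χ² doesn't work on percentages or means.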
  • Degrees of Freedom: Students often use n instead of n-1. In a 2x2 table, df is always 1!
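The first two pitfalls are easy to demonstrate numerically. In the sketch below, the 100-section counts are the pavement data from Example 6, while the 1000-section and 10-section cases are hypothetical scale-ups and scale-downs invented purely to show the effect. The helper names are illustrative.

```python
def check_expected_counts(expected, minimum=5):
    """Return the expected counts that fall below the rule-of-thumb minimum."""
    return [e for e in expected if e < minimum]


def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Example 6 counts: 100 pavement sections, 1:2:1 ratio expected.
counts_100 = chi_square_statistic([30, 45, 25], [25, 50, 25])

# Hypothetical scale-up: identical percentages, but 1000 sections surveyed.
counts_1000 = chi_square_statistic([300, 450, 250], [250, 500, 250])

print(counts_100, counts_1000)  # 1.5 15.0

# Expected counts for 100 sections pass the check; for a hypothetical 10 sections they do not.
print(check_expected_counts([25, 50, 25]))   # []
print(check_expected_counts([2.5, 5, 2.5]))  # [2.5, 2.5]
```

The same 30/45/25 percentage split is not significant with 100 sections but would be with 1000, which is why the raw counts must go into the formula.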

© 2024 Civil Stats Blog | Helping Engineers Make Data-Driven Decisions
