# Answer to Question #6270 in Statistics and Probability for jineen

Question #6270

A file clerk is assigned the task of selecting a random sample of 26 company accounts (from a total of 5,000) to be audited. The clerk is considering two sampling methods. Method A: Organize the 5,000 company accounts in alphabetical order (according to the first letter of the client's last name), then randomly select one account card for each of the 26 letters of the alphabet. Method B: Assign each company account a 4-digit number from 0001 to 5000. Using a computer random number generator, choose 26 4-digit numbers (from 0001 to 5000) and match the numbers with the corresponding company accounts. Which of the two methods would you recommend to the file clerk? Which sampling method could possibly yield a biased sample? Why?

Expert's answer

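I would recommend Method B because it comes much closer to a true random sample: every account has the same chance of being selected, and the observations are essentially independent of one another. Strictly speaking, a computer generates pseudo-random numbers rather than truly random ones, so the sample is not random in the purest sense, but a good generator is adequate for drawing an audit sample.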
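As a rough illustration of Method B, the draw can be sketched in a few lines of Python. The function name `draw_simple_random_sample` and the `seed` argument are assumptions made for this example, not part of the original question; the sketch also assumes the 26 numbers must be distinct, so it samples without replacement.

```python
import random

def draw_simple_random_sample(population_size=5000, sample_size=26, seed=None):
    """Return distinct account numbers as zero-padded 4-digit strings."""
    # The seed is optional; fix it only if the draw must be reproducible.
    rng = random.Random(seed)
    # random.sample draws without replacement, so no account appears twice.
    chosen = rng.sample(range(1, population_size + 1), sample_size)
    return [f"{n:04d}" for n in sorted(chosen)]

sample = draw_simple_random_sample(seed=6270)
print(sample)
```

Each printed number would then be matched to the account carrying that number.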

The clerk should not use Method A because it can yield a biased sample. In real life, clients' last names are not uniformly distributed across the alphabet. Among 5,000 accounts there might be, for example, 200 accounts under the letter 'N' and only 1 under 'Y'. When the clerk selects one account for each letter, each 'N' account has only a 1-in-200 chance of being chosen, while the single 'Y' account is chosen every time. Because the accounts do not have equal chances of selection, the resulting data cannot be analyzed as a random sample.
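The unequal selection chances under Method A can be checked with a little arithmetic. The sketch below uses only the two letter counts from the example above ('N' with 200 accounts, 'Y' with 1); the function names are hypothetical, and Method B is assumed to draw 26 distinct accounts out of 5,000.

```python
from fractions import Fraction

TOTAL_ACCOUNTS = 5000
SAMPLE_SIZE = 26

# Letter counts taken from the example in the answer ('N' and 'Y' only).
accounts_per_letter = {"N": 200, "Y": 1}

def inclusion_probability_method_a(letter_count):
    """One account is drawn per letter, so each account has a 1/count chance."""
    return Fraction(1, letter_count)

def inclusion_probability_method_b():
    """Sampling 26 of 5,000 without replacement gives every account chance n/N."""
    return Fraction(SAMPLE_SIZE, TOTAL_ACCOUNTS)

p_b = inclusion_probability_method_b()
for letter, count in accounts_per_letter.items():
    p_a = inclusion_probability_method_a(count)
    print(f"{letter}: Method A = {float(p_a):.4f}, Method B = {float(p_b):.4f}")
```

Under these assumptions, Method B gives every account the same chance (26/5000), while under Method A the chance depends entirely on how many accounts share a first letter.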
