Representativeness of a sample. The representative sample

Encyclopedia plants 20.09.2019

The concept of representativeness comes up often in statistical reporting and in the preparation of speeches and reports. It is hard to imagine presenting almost any kind of information for review without it.

Representativeness - what is it?

Representativeness reflects how well the chosen objects or parts correspond to the content and meaning of the data set from which they were selected.

Other definitions

The concept of representativeness can be explained in different contexts. In essence, however, representativeness is the correspondence between the features and properties of the selected units and those of the whole population, so that the selected units accurately reflect the characteristics of the population as a whole.

Representativeness is also defined as the ability of sample data to represent the parameters and properties of the population that matter for the study.

Representative sample

The principle of sampling is to select units that capture the most important properties of the whole data set as accurately as possible. Various methods are used for this; they make it possible to obtain accurate results and a general picture of all the data from selected material only.

Thus there is no need to study all the material; it is enough to examine a representative sample, that is, a selection of individual data made in order to form an idea of the total mass of information.

Depending on the method of selection, samples are divided into probability and non-probability samples. A probability sample is a kind of random sample based on the principle of an ordinary lottery. Who exactly ends up in such a sample is not taken into account; only the blind draw is used.

A non-probability sample is made by deliberately picking the data that seem most important and interesting and that will then stand for the general population. It is a considered choice, or a haphazard one, that is nevertheless justified by its content rather than by chance.

Non-probability samples

Non-probability samples can in turn be divided into several types:

  • One of the simplest and most easily understood is the convenience (unrepresentative) sample. This method is often used in social surveys: participants are not chosen from the crowd by any specific feature, and information is simply collected from, say, the first 50 people who agree to take part.
  • Purposive (deliberate) samples impose a number of requirements and conditions on selection, but within those conditions they still rely on chance and do not aim at good statistics.
  • The quota sample is another variety of non-probability sample, often used to study large data sets. It sets a number of conditions and norms, and the objects that meet them are selected. In a social survey, for example, 100 people may be interviewed, but only the opinions of those who meet the established requirements are counted in the statistical report.

Probability samples

For probability samples, a number of parameters are defined that the objects in the sample must match, and the facts and data that will make up the representative sample are then chosen from among the matching objects in different ways. These ways include:

  • Simple random sample. From the selected segment, the required amount of data is drawn entirely at random, as in a lottery, and forms the representative sample.
  • Systematic random sample. The data are selected according to a rule that starts from a randomly chosen unit. If the first randomly chosen serial number is 5, then the subsequent units may be, for example, 15, 25, 35 and so on. This example shows that even a random choice can rest on a systematic calculation.
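
The two probability methods in this list can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the population is a numbered list, and the function names `simple_random_sample` and `systematic_sample`, as well as the `start` and `step` parameters, are invented for the example and do not come from the article.

```python
import random


def simple_random_sample(units, n, seed=None):
    """Draw n distinct units purely by chance, as in a lottery."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(list(units), n)


def systematic_sample(units, start, step):
    """Take the unit with serial number `start`, then every `step`-th unit after it.

    Serial numbers are counted from 1, as in the text.
    """
    units = list(units)
    return units[start - 1::step]


if __name__ == "__main__":
    population = range(1, 41)  # hypothetical units numbered 1..40
    print(simple_random_sample(population, 5, seed=1))
    print(systematic_sample(population, start=5, step=10))  # [5, 15, 25, 35]
```

In a real systematic sample the starting number itself would be drawn at random; here it is passed in so that the example from the text (5, 15, 25, 35) can be reproduced.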

Selection of consumers

A meaningful sample is built by assessing each individual segment and, on the basis of that assessment, assembling a set that reflects the characteristics and properties of the whole database. In this way a large amount of data meeting the requirements of a representative sample is collected. A number of items can then be left out of the total without lowering the quality of the selected data as a representation of the population. This is how the representativeness of the research results is determined.

Sample size

Not the least of the questions to be settled is the size of the sample needed to represent the general population. The sample size does not always depend on the number of units in the general population. The representativeness of the sample does, however, depend directly on how many segments the results must be broken down into: the more segments, the more data the sample must contain. If the results need only a general picture and no specifics, the sample can be smaller, because the information is prepared more superficially and read at a more general level.

The concept of a representativeness error

A representativeness error is a discrepancy between the characteristics of the general population and those of the sample data. No sample study can yield perfectly accurate data, as a complete study of the population would, because a sample represents only part of the information and parameters; a fully detailed picture is possible only when the whole population is studied. Some inaccuracy is therefore inevitable.

Types of errors

Several kinds of error arise when a representative sample is drawn:

  • Systematic.
  • Random.
  • Deliberate.
  • Unintentional.
  • Standard.
  • Limit.

Random errors stem from the fact that only part of the population is studied rather than all of it. A random representativeness error is usually small.

Systematic errors, by contrast, arise when the rules for selecting data from the population are violated.

The mean (standard) error is the average difference between the sample mean and the population mean. For a repeated sample it does not depend on the number of units in the general population, but it does depend on the sample size: the larger the sample, the smaller the mean error.

The limit error is the largest likely difference between the sample mean and the population mean. It is the maximum error that can be expected with a given probability under the given conditions.

Deliberate and unintentional representativeness errors

Bias errors may be intentional or unintentional.

Intentional errors arise when data are selected tendentiously, that is, chosen to fit a trend. Unintentional errors occur while the sample observation is being prepared and the representative sample is being formed. To prevent them, a good sampling frame is needed: a list of the selection units. The frame must fully meet the objectives of the sample, be reliable and cover all aspects of the study.

Validity, reliability, representativeness. Error calculation

Calculating the representativeness error (m) of the arithmetic mean (M):

$$m_M = \frac{\sigma}{\sqrt{n}}$$

where σ is the standard deviation and n is the sample size (n > 30).

Representativeness error (m) of a relative value (P):

$$m_P = \sqrt{\frac{P \cdot q}{n}}$$

where q = 100 − P (with P in %) and n is the number of observations (n > 30).

When the sample is small, fewer than 30 units, the number of observations in the denominator is reduced by one, that is, n − 1 is used instead of n.

The magnitude of the error depends directly on the size of the sample. How representative the information is, and how far an exact forecast can be made from it, is expressed by the size of the limit error.
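
As a rough illustration of the calculation, the sketch below assumes the usual textbook formulas: m = σ/√n for a mean, m = √(p·q/n) for a share in percent with q = 100 − p, n − 1 in the denominator when the sample has 30 units or fewer, and a limit error Δ = t·m. The function names and the way the cut-off at 30 is handled are assumptions made for this example.

```python
import math


def mean_error(sigma, n):
    """Representativeness error of an arithmetic mean (assumed: sigma / sqrt(n))."""
    denom = n if n > 30 else n - 1  # small samples use n - 1
    return sigma / math.sqrt(denom)


def share_error(p, n):
    """Representativeness error of a share p given in percent (assumed: sqrt(p*q/n))."""
    q = 100.0 - p
    denom = n if n > 30 else n - 1
    return math.sqrt(p * q / denom)


def limit_error(m, t=2.0):
    """Limit error for confidence coefficient t (t = 2 is roughly 95%)."""
    return t * m


if __name__ == "__main__":
    m = share_error(25, 300)   # share of 25% observed in a sample of 300
    print(m, limit_error(m))   # 2.5 5.0
```

With a share of 25% and 300 observations the limit error at t = 2 comes out at 5 percentage points, so the share would be reported as 25% ± 5%.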

Representative systems

A representative sample is used not only in evaluating information; the person receiving information also relies on representative systems. The brain, in effect, builds a representative sample from the whole information stream in order to assess the incoming data quickly and accurately and grasp the essence of the matter. On the scale of human consciousness, the question "Representativeness - what is it?" has a fairly simple answer. The brain uses the senses at its disposal, depending on what information has to be extracted from the general flow. Accordingly, the following are distinguished:

  • The visual representative system, which involves the organs of sight. People who rely mainly on this system are called visuals. With it, a person processes information in the form of images.
  • The auditory representative system. The main organ involved is hearing. Information arriving as sound or speech is processed by this system. People who take in information better by ear are called audials.
  • The kinesthetic representative system, which processes the information flow perceived through the olfactory and tactile channels.
  • The digital representative system, which is used together with the others as a means of obtaining information from outside; it covers the perception and comprehension of the data received.

So what is representativeness: a simple sample from a larger set, or an integral procedure in processing information? It can be said with confidence that representativeness largely shapes our perception of data streams, helping us single out the most meaningful and significant parts.

The ultimate goal of studying a sample is always to obtain information about the general population. For this, the sample study must satisfy certain conditions. One of the main conditions is the representativeness of the sample. As discussed earlier, qualitative and quantitative representativeness are distinguished.

Randomness, which guarantees the qualitative (structural) representativeness of a statistical study, is achieved by meeting a number of conditions when forming the sample groups:

1. Each member of the general population must have an equal chance to get into the sample.

2. Observation units must be selected from the general population independently of the trait being studied. If the selection is purposive, the condition that the distribution of the studied trait is independent of the selection must still be met.

3. The selection should be carried out from homogeneous groups.

Compliance with the conditions that guarantee the closest match between the sample and the general population is ensured by special selection methods. Depending on how they are formed, the following samples are distinguished:

1. Samples that do not require dividing the general population into parts (the simple random sample, repeated or non-repeated).

2. Samples that require dividing the general population into parts (mechanical, typical or typological, cohort, and matched-pair samples).

A simple random sample is formed by random selection, that is, blindly. Random selection is based on mixing. Examples: drawing a ball in Sportloto after all the balls have been mixed, choosing winning lottery numbers, randomly choosing patient cards for a study. Sometimes random numbers are used, taken from tables of random numbers or produced by random number generators. From a previously numbered list of the general population, the observation units whose numbers match the random numbers are selected.

Once an object has been selected for a random sample and all the necessary data about it have been recorded, there are two options: the object can be returned to the general population or not. Accordingly, the sample is called repeated (the object is returned to the general population) or non-repeated (the object is not returned). Since in most statistical studies the difference between repeated and non-repeated samples is practically negligible, the sample is conventionally assumed to be repeated.
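
The difference between returning and not returning a selected object can be shown with Python's standard `random` module. This is a minimal sketch; the function names and the list of numbered cards are hypothetical.

```python
import random


def repeated_sample(population, n, seed=None):
    """Sampling with return: a selected object goes back and may be drawn again."""
    rng = random.Random(seed)
    return rng.choices(population, k=n)


def non_repeated_sample(population, n, seed=None):
    """Sampling without return: each object can be selected at most once."""
    rng = random.Random(seed)
    return rng.sample(population, n)


if __name__ == "__main__":
    cards = list(range(1, 11))  # ten hypothetical numbered patient cards
    print(repeated_sample(cards, 5, seed=7))      # duplicates are possible
    print(non_repeated_sample(cards, 5, seed=7))  # all numbers are distinct
```

A repeated sample may be larger than the population itself, while a non-repeated one cannot: `rng.sample` raises `ValueError` if more objects are requested than exist.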

Estimating the required sample size

For the sample to be quantitatively representative of the general population, the number of observations to be included in it must be estimated in advance.

When the size of the general population is unknown, the size of a repeated sample that guarantees representative results, when the result is expressed as a relative value (share), is determined by the formula:

$$n = \frac{t^2 \cdot p \cdot q}{\Delta^2}$$

where p is the value of the indicator of the studied trait, in %; q = 100 − p;

t is the confidence coefficient, which shows the probability that the indicator will not go beyond the bounds of the limit error (t = 2 is usually taken, which gives a 95% probability of an error-free forecast);

Δ is the limit error of the indicator.

For example: one of the indicators characterizing the health of workers at industrial enterprises is the percentage of workers who were not ill during the year. Suppose that for the industry to which the enterprise under study belongs, this indicator is 25%. The limit error that can be allowed, so that the spread of the indicator stays within reasonable bounds, is 5%. The indicator can then take values of 25% ± 5%, that is, from 20% to 30%. Taking t = 2, we get

$$n = \frac{2^2 \cdot 25 \cdot 75}{5^2} = 300$$

If the indicator is an average value, the number of observations can be determined by the formula:

$$n = \frac{t^2 \cdot \sigma^2}{\Delta^2}$$

where σ is the standard deviation, which can be obtained from previous studies or from a trial (pilot) study.

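With non-repeated selection and a known size of the general population, the required size of a random sample when relative values (shares) are used is determined by the formula:

$$n = \frac{t^2 \cdot p \cdot q \cdot N}{N \cdot \Delta^2 + t^2 \cdot p \cdot q}$$

For average values the formula is:

$$n = \frac{t^2 \cdot \sigma^2 \cdot N}{N \cdot \Delta^2 + t^2 \cdot \sigma^2}$$

where N is the size of the general population.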

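Using the conditions of the example above and taking the size of the general population as N = 500 workers, we get:

$$n = \frac{2^2 \cdot 25 \cdot 75 \cdot 500}{500 \cdot 5^2 + 2^2 \cdot 25 \cdot 75} = \frac{3\,750\,000}{20\,000} = 187.5 \approx 188$$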

It is easy to see that the required sample size is smaller with non-repeated selection than with repeated selection (188 and 300 workers, respectively).

In general, the number of observations needed to obtain representative data is inversely proportional to the square of the allowable error.
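
The sample-size rules of this section can be checked numerically. The sketch below assumes the standard formulas n = t²·p·q/Δ² (repeated selection) and n = t²·p·q·N/(N·Δ² + t²·p·q) (non-repeated selection from a population of size N), with σ² taking the place of p·q for averages. The function names are invented for the example, and results are rounded up to whole observations.

```python
import math


def size_for_share(p, delta, t=2.0, population=None):
    """Required sample size for a share p (in %) with limit error delta (in %)."""
    q = 100.0 - p
    base = t ** 2 * p * q
    if population is None:  # repeated selection, population size unknown
        n = base / delta ** 2
    else:                   # non-repeated selection from a known population
        n = base * population / (population * delta ** 2 + base)
    return math.ceil(n)


def size_for_mean(sigma, delta, t=2.0, population=None):
    """Required sample size for an average with standard deviation sigma."""
    base = t ** 2 * sigma ** 2
    if population is None:
        n = base / delta ** 2
    else:
        n = base * population / (population * delta ** 2 + base)
    return math.ceil(n)


if __name__ == "__main__":
    print(size_for_share(25, 5))                  # 300
    print(size_for_share(25, 5, population=500))  # 188
    print(size_for_share(25, 2.5))                # 1200
```

The last call shows the inverse-square relationship: halving the allowable error from 5% to 2.5% quadruples the required number of observations, from 300 to 1200.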

Mechanical sample: a sample in which observation units are selected from the surveyed population mechanically. For example: selecting every fifth or every tenth worker from the personnel card file of an enterprise or from the outpatient cards of MSH polyclinics.

A typical, typological or zoned sample involves dividing the general population into a number of qualitatively homogeneous groups. For example: when student morbidity at a university is studied, student groups are selected in each year of study for an in-depth examination. This selection method is often combined with others. For example: the territory of a city is divided into typical districts according to the degree of pollution, and observation groups are formed within those districts by random selection.

Cohort selection is a form of purposive sampling. With this method, a cohort is selected from the general population (the division into subgroups is non-random): a group united by the moment at which some trait appeared or some exposure occurred that plays a significant role in the study (year of birth, onset of the disease, start of a drug, etc.).

A case-control study (CC) is a type of epidemiological study in which the distribution of a risk factor is compared between a group of patients with the disease and a control group. A case-control study is retrospective, since the researcher, having divided the subjects into groups according to whether or not they have the disease, collects information from them about the past.

The use of the sampling method in sanitary statistics for studying the general morbidity of the population deserves separate mention. The theoretical premises of the sampling method were tested in special studies. Thus, in 1928 V.S. Bykhovsky et al. processed 132.8 thousand cards with morbidity data in parallel by the complete method and by mechanical selection of every fifth card. Analysis of the results showed that the sample data on morbidity were highly representative. To this day, however, there are no unified methodological approaches in the general practice of sample-based sanitary-statistical studies.

Let us get acquainted with three concepts that anyone who comes into contact with sociological studies needs to know: the general population, the sample, and representativeness.

General population: all the units that are the object of a particular research program. If we are talking about an all-Russian public opinion survey, it is the whole adult population of Russia. Or all Moscow students, if we conduct a survey among them. Or all the street children of Kaluga, if we are going to carry out a sociological study on that topic.

Sample: the part of the general population that we study directly, that is, the people we approach with interview questions or questionnaires, the materials we study by content analysis, and so on.

Sometimes the sample is equal to the general population (for example, when we interview all first-year students of the MGU faculty of journalism). But usually it is smaller, sometimes by tens or hundreds of times. The practice of sociological research has shown that in nationwide studies it is enough to select 1.5-2 thousand people for a survey. If the sample is formed properly and representatively, it can give objective information about the opinion of all Russians.

So the main thing is to form the sample correctly. The sample size depends on the objectives of the study, the specifics and degree of homogeneity of the object of study, how finely the groups to be studied are divided, and the planned degree of representativeness. What does this magic word, the most important one in empirical sociology, "representativeness", mean?

Representativeness is the correspondence, the adequacy of the sample to the general population in its main characteristics. If the population consists of 55% women and 45% men, the sample should have the same ratio. The same applies to age, profession, type of settlement, and so on. In short, the configuration of the sample should coincide with that of the general population. This can be shown in a picture (Fig. 8).
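
One simple way to check such a match is to compare the shares of each category in the sample with the known shares in the population. The sketch below is an illustration only: the helper names `shares` and `max_share_gap` are invented, and the "largest absolute gap" used as a measure of mismatch is an assumption of this example, not a criterion given in the text.

```python
from collections import Counter


def shares(values):
    """Share of each category among the given values (fractions that sum to 1)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def max_share_gap(sample_values, population_shares):
    """Largest absolute difference between sample and population shares."""
    sample_shares = shares(sample_values)
    return max(
        abs(sample_shares.get(category, 0.0) - share)
        for category, share in population_shares.items()
    )


if __name__ == "__main__":
    population = {"women": 0.55, "men": 0.45}
    good = ["women"] * 55 + ["men"] * 45
    skewed = ["women"] * 70 + ["men"] * 30
    print(max_share_gap(good, population))    # 0.0: structure is reproduced
    print(max_share_gap(skewed, population))  # about 0.15: women over-represented
```

The same check can be repeated for age, profession or type of settlement by passing the corresponding category labels.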

The most important thing in a sociological study is the representativeness of the sample, because the accuracy and objectivity of the results depend on it.

A sample can be formed in different ways. But there are two main types: representative and unrepresentative samples.

Representative samples

A probability, or random, sample is based on the fact that every object of the general population has an equal probability of being included in the sample. There are several subtypes of probability sample.

1. Systematic selection. It is very popular and often used in sociological research. Depending on the sample size, every n-th (6th, 20th, 45th, etc.) object is selected from the general population. For example, we are surveying the adult population of one polling district. We take the electoral lists. Suppose they contain 10,000 people, and we need a sample of 500. We divide the size of the general population, 10,000, by the sample size, 500, and get 20. So we select every twentieth voter from the lists.

Suppose we need to interview Muscovites and find out which TV program they are watching at the moment. We take a telephone directory, count how many numbers it contains, divide that number by the number of people we need to interview, and get the step with which we will systematically select numbers.

The same can be done with houses on a street, if we interview respondents at home: for example, on the even side of the street we visit every fifth house, and so on.

2. Selection by lottery or lot. This method is familiar to everyone: slips of paper listing, for example, all the streets of Moscow are put into a hat, vase or box, and 20 are drawn on which the study will be conducted. Regions, settlements, post offices, etc. can be chosen in the same way.

3. Selection by random numbers. For this, special mathematical tables of random numbers are used; as many numbers are taken as there are units in the sample, and the objects previously marked with those numbers are selected.
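
The step calculation from the polling-district example in item 1 can be written out as follows. This is a sketch under stated assumptions: the voter list is a plain Python list, the function names are hypothetical, and the starting position within the first interval is drawn at random, a detail the text does not specify.

```python
import random


def sampling_step(population_size, sample_size):
    """Selection step: population size divided by the required sample size."""
    return population_size // sample_size


def every_kth(units, step, seed=None):
    """Select every `step`-th unit, starting at a random position in the first interval."""
    rng = random.Random(seed)
    start = rng.randrange(step)  # assumption: random start within the first `step` units
    return units[start::step]


if __name__ == "__main__":
    voters = list(range(1, 10_001))  # hypothetical list of 10,000 voters
    step = sampling_step(len(voters), 500)
    chosen = every_kth(voters, step, seed=0)
    print(step, len(chosen))  # 20 500
```

The same two functions cover the telephone-directory and house-by-house cases: only the list of units and the required sample size change.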

A quota sample is formed according to quotas (that is, objects having a certain characteristic: sex, age, place of residence, etc.) that reproduce the percentages found in the general population. Suppose we are studying the population of a small town and know the percentages of young, middle-aged and elderly people, of men and women, of working people and pensioners. We must select respondents with these characteristics in the same percentages. In its degree of representativeness this sample is close to a probability sample.
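
Turning known population percentages into quotas for a given sample size is simple arithmetic, sketched below. The function name, the example percentages and the largest-remainder rounding rule (used so that the quotas add up exactly to the sample size) are assumptions of this illustration; the percentages are expected to sum to 100.

```python
def quotas(shares_percent, sample_size):
    """Split sample_size among groups in proportion to their population percentages."""
    raw = {group: sample_size * share / 100 for group, share in shares_percent.items()}
    result = {group: int(value) for group, value in raw.items()}  # round down first
    leftover = sample_size - sum(result.values())
    # Hand the remaining places to the groups with the largest fractional parts.
    by_remainder = sorted(raw, key=lambda group: raw[group] - result[group], reverse=True)
    for group in by_remainder[:leftover]:
        result[group] += 1
    return result


if __name__ == "__main__":
    print(quotas({"women": 55, "men": 45}, 200))  # {'women': 110, 'men': 90}
    print(quotas({"young": 30, "middle-aged": 45, "elderly": 25}, 50))
```

Interviewers then recruit respondents until each quota is filled; how the individual respondents are found within a quota is not determined by this calculation.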

A stratified sample differs from a quota sample in that layers (strata) are formed artificially, according to the objectives of the study, and as a rule they are equal in size. The strata must be more homogeneous than the population as a whole. For example, we study readers of different publications, "AIF", "Izvestia", "Trud", "Komsomolskaya Pravda", "MK", and form equal strata of readers of each, say 200 people.
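
Equal-sized strata of this kind can be sketched as follows. The function name, the `stratum_of` callback and the made-up reader records are hypothetical, and drawing the members of each stratum at random is an assumption of the example.

```python
import random
from collections import defaultdict


def equal_strata_sample(units, stratum_of, per_stratum, seed=None):
    """Group units into strata and draw the same number of units from each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    # rng.sample raises ValueError if a stratum has fewer than per_stratum members.
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


if __name__ == "__main__":
    papers = ["AIF", "Izvestia", "Trud", "Komsomolskaya Pravda", "MK"]
    # Hypothetical reader records: (reader id, publication read).
    readers = [(i, papers[i % len(papers)]) for i in range(1500)]
    sample = equal_strata_sample(readers, lambda r: r[1], per_stratum=200, seed=42)
    print({paper: len(chosen) for paper, chosen in sample.items()})
```

Because every stratum contributes the same number of readers regardless of its real share of the readership, results for the whole population would need reweighting; within each stratum they can be compared directly.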

A zoned sample is usually used in studying regions, often with a geographic map, a settlement plan, etc., from which certain units are selected for study. For example, districts are selected from different geographical zones of Russia, or from the districts of Moscow. Sometimes the so-called geographic cross technique is used, in which points are selected along the horizontal and the vertical of a cross laid over the map. This is how the sample was formed in public opinion studies in the 1960s at the Institute of Public Opinion of Komsomolskaya Pravda.

A serial, nest or cluster sample works not with individual units but with nests, homogeneous groups (a family, a work team, a student group, the fans at a football match, viewers watching TV in one room, city districts, etc.). As a rule, every member of a selected nest is surveyed.
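
A cluster sample with a complete survey inside each selected nest might look like this in code. The sketch assumes the nests are given as a dictionary from a nest name to its members and that the nests themselves are chosen at random; the names are hypothetical.

```python
import random


def cluster_sample(clusters, k, seed=None):
    """Choose k whole clusters at random and take every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k)  # sorted names keep the draw reproducible
    return {name: list(clusters[name]) for name in chosen}


if __name__ == "__main__":
    # Hypothetical student groups with five members each.
    groups = {f"group-{g}": [f"student-{g}-{i}" for i in range(5)] for g in range(1, 9)}
    surveyed = cluster_sample(groups, 3, seed=3)
    for name, members in surveyed.items():
        print(name, len(members))
```

Only the nests are sampled; within a chosen nest nobody is left out, which matches the complete survey described above.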

Requirements for the sample

A number of mandatory requirements apply to a sample, defined above all by the goals and objectives of the study. Experiment planning should take into account both the sample size and a number of its features. In psychological research, for instance, the requirement of sample homogeneity is important. It means that a psychologist studying, say, adolescents cannot include adults in the same sample. By contrast, a study using the method of age cross-sections by definition assumes subjects of different ages. Even then, however, the sample should be homogeneous, but on other criteria, above all such as age and sex. A homogeneous sample can be formed on the basis of various characteristics, such as level of intelligence, nationality, absence of certain diseases, etc., depending on the objectives of the study.

General statistics distinguishes repeated and non-repeated samples, or, in other words, samples with and without return. The usual example is drawing a ball from a container. In sampling with return, each drawn ball is put back into the container and can therefore be drawn again. In sampling without return, a ball once drawn is set aside and can no longer take part in the sample. Analogues of both can be found in psychological studies, since a psychologist often has to test the same subjects several times with the same technique. Strictly speaking, however, what is repeated in that case is the testing procedure. The sample of subjects, even when its composition is completely identical across repeated studies, will always differ somewhat because of the functional and age-related variability inherent in all people. By the nature of the procedure such a sample is repeated, although the meaning of the term here is obviously different from the case of the balls.

It is important to emphasize that all the requirements for any sample come down to this: on its basis the psychologist must obtain the most complete and undistorted information about the features of the general population from which the sample is taken. In other words, the sample should reflect the characteristics of the general population under study as fully as possible.

The composition of the experimental sample should represent (model) the general population, since the conclusions obtained in the experiment are meant to be extended to the whole general population. The sample must therefore have a special quality, representativeness, which allows the findings obtained on it to be extended to the whole general population.


Representativeness of the sample is very important, but for objective reasons it is extremely difficult to achieve. It is a well-known fact that 70% to 90% of all psychological studies of human behavior conducted in the United States in the 1960s used college students as subjects, most of them psychology students. In laboratory studies on animals, rats are the most common object of study. It is therefore no coincidence that psychology used to be called "the science of sophomores and white rats". College psychology students make up only 3% of the total population of the United States. Obviously, a sample of students is unrepresentative as a model claiming to represent the whole population of the country.

A representative sample is one in which all the main features of the general population are present in approximately the same proportion and with the same frequency as in that population. In other words, a representative sample is a smaller but accurate model of the general population it is meant to reflect. To the extent that the sample is representative, conclusions based on studying it can be considered applicable to the whole general population. This extension of results is called generalizability.

Ideally, a representative sample should be such that each of the main characteristics, traits, personality features, etc. studied by the psychologist is present in it in the same proportion as in the general population. Accordingly, the sampling procedure must have an internal logic capable of convincing the researcher that the sample, when compared with the general population, will indeed be representative.

In practice the psychologist proceeds as follows: identifies a subgroup (sample) within the general population, studies this sample in detail (carries out experimental work with it) and then, if the results of statistical analysis allow it, extends the conclusions to the whole general population. These are the main stages of the psychologist's work with a sample.

A beginner psychologist should keep in mind a frequently repeated error: whenever data are collected by any method and from any source, there is a temptation to extend the conclusions to the whole general population. To avoid this error one needs not just common sense but, above all, a good command of the basic concepts of mathematical statistics.
