Glossary of Analytics Terms

Learn the various statistical and specialized terms used in Opensurvey analytics reports. Before you start your analysis, review the meanings of key terms to find insights hidden in your data more accurately.

1️⃣ Basic Analytics Terms

These are the foundational terms for understanding the overall collection status and reliability of your data.

  • Number of respondents: The number of unique respondents (Unique Sample) who participated in the survey. For standard surveys where each respondent can only submit one response, the number of respondents and the number of responses are the same.

  • Number of responses: The total count of responses collected. For diary surveys where respondents can participate multiple times in a single survey, the number of respondents and the number of responses may not match.

  • Number of variables: Displays the total number of variables in the survey.

  • Sampling error: A figure that indicates how much the survey results obtained from a sample may differ from the actual opinions or behaviors of the entire population. Generally, under the same confidence level, the larger the sample size relative to the population, the smaller the sampling error; the smaller the sample size, the larger the sampling error. Opensurvey displays the maximum sampling error that may occur under an 80% confidence level, based on the number of respondents who have completed the survey to date.

  • Survey method: The method by which the survey was conducted. If the survey was conducted using Opensurvey's own panel, it will be displayed as 'Opensurvey Panel.' If respondents were recruited using a web-based survey link other than the panel, it will be displayed as 'Opensurvey Form.'

  • Response period: Displays the period from when the first response was submitted to when all responses were completed.

  • Population: The entire set of subjects of interest from which information is sought. This population can be defined differently depending on what you want to know. If you want to know citizens' opinions on Seoul city policies, the entire population of Seoul becomes the population; if you want to know the satisfaction of consumers who purchased from Shopping Mall A, it is limited to purchasers of Shopping Mall A. Since surveying the entire population is often not possible, a survey is typically conducted by extracting a sample from the population.

  • Confidence level: When a survey is conducted by sampling from a population, each sample drawn is different, so the results are never exactly the same. The confidence level indicates how likely it is that results obtained at the current sample size would be reproduced if the survey were repeated. In analytics, sampling error is measured at a default 80% confidence level to allow closer examination of even small differences between groups. For example, if 1,000 people are randomly sampled from an infinitely large population and surveyed, the maximum sampling error at an 80% confidence level is ±2.0%p. If satisfaction with a certain product is measured at 50% in this survey, you can expect that if the same survey were repeated 100 times, about 80 of the results would fall between 48.0% and 52.0%.
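
The ±2.0%p figure in the confidence level example can be reproduced with the standard margin-of-error formula for a proportion. The sketch below is an illustration only: it assumes simple random sampling from a very large population and the worst-case proportion of 50%, and the function name `max_sampling_error` is hypothetical. It is not confirmed to be the exact calculation Opensurvey uses.

```python
from math import sqrt
from statistics import NormalDist


def max_sampling_error(n: int, confidence: float = 0.80) -> float:
    """Maximum margin of error (as a proportion) for a simple random sample of size n.

    Assumption: infinitely large population, so no finite population correction.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    # Two-sided critical value; for an 80% confidence level this is about 1.2816.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # The error is largest when the observed proportion is 50%.
    return z * sqrt(0.5 * 0.5 / n)


error = max_sampling_error(1000)
print(f"±{error * 100:.1f}%p")  # prints ±2.0%p
```

Because the error shrinks with the square root of the sample size, quadrupling the number of respondents only halves the maximum sampling error.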

2️⃣ Crosstab-Related Terms

These are terms used when comparing differences between groups and understanding associations between variables.

  • Crosstab: An analysis method for understanding the association between two variables. It calculates the column percentage (Column %) of the analysis target based on the analysis unit. For example, to find out if there are gender differences in the response results for a specific variable, you would set gender as the analysis unit and the response results of that variable as the analysis target to create a crosstab.

  • Analysis unit: The analysis unit is the reference information for interpreting data when creating a crosstab. This analysis unit appears on the horizontal axis of the crosstab, and the data in the crosstab changes depending on what the analysis unit is.

  • Analysis target: The analysis target refers to the data you want to interpret in the crosstab. This analysis target is positioned on the vertical axis of the crosstab, and the data for the analysis target is displayed based on the configured analysis unit.

  • Percentage: A button that sets the crosstab to display data based on response percentages. This percentage is calculated by dividing the number of responses for the analysis target under a given analysis unit by the total number of responses for that analysis unit.

  • Frequency: Selecting 'Frequency' in the crosstab displays data based on the count of responses. If you want to display both percentage and frequency, select 'Percentage % (Frequency).'
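
The percentage rule described above can be sketched in a few lines. This is a minimal illustration that assumes single-choice responses stored as dictionaries; the function name `column_percentages` and the field names `gender` and `answer` are hypothetical and do not come from Opensurvey.

```python
from collections import Counter, defaultdict


def column_percentages(responses, unit_key, target_key):
    """Return {unit value: {target value: percent}}; each unit's column sums to 100."""
    counts = defaultdict(Counter)
    for row in responses:
        # Frequency: count responses per (analysis unit, analysis target) pair.
        counts[row[unit_key]][row[target_key]] += 1

    table = {}
    for unit_value, target_counts in counts.items():
        # Percentage: divide by the total number of responses for that analysis unit.
        column_total = sum(target_counts.values())
        table[unit_value] = {
            target: 100 * count / column_total
            for target, count in target_counts.items()
        }
    return table


responses = [
    {"gender": "F", "answer": "Yes"},
    {"gender": "F", "answer": "Yes"},
    {"gender": "F", "answer": "No"},
    {"gender": "F", "answer": "No"},
    {"gender": "M", "answer": "Yes"},
    {"gender": "M", "answer": "No"},
    {"gender": "M", "answer": "No"},
    {"gender": "M", "answer": "No"},
]
crosstab = column_percentages(responses, "gender", "answer")
print(crosstab["F"]["Yes"], crosstab["M"]["Yes"])  # prints 50.0 25.0
```

Here gender is the analysis unit and the answer is the analysis target, so each percentage is read within a gender column.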

3️⃣ Numeric Variable Terms

  • Mean: The value obtained by dividing the sum of all response values by the number of responses. In analytics, the arithmetic mean is displayed as 'Mean.'

  • Standard deviation: A figure indicating the dispersion of data. The larger the value, the more widely spread the data; the smaller the value, the more closely clustered around the mean.

  • Mode: The most frequently occurring value among all responses. If there are two or more modes, the smallest value among them is displayed. It is used to find the value that can satisfy the most people or to check the distribution of the entire dataset, which is difficult to understand from the mean alone.

  • Maximum / Minimum: The largest / smallest value among all responses. Useful for determining whether outliers or extreme values are included in all response values or for understanding the range of responses.

  • Median: The value positioned at the center when all response values are arranged in order of size. Since the mean is greatly affected by extreme values, the median can serve as a useful representative value for data containing extreme values. For normally distributed data, the mean, median, and mode appear similar; if the mean is much larger or smaller than the median, there is a possibility that outliers or extreme values are present.

  • Percentile 30: The number corresponding to the top 30% when all responses are arranged in descending order. When dividing response values into three groups (low / mid / high), Percentile 30 can be used as the cutoff point for classifying the high group.

  • Percentile 70: The number corresponding to the top 70% when all responses are arranged in descending order. When dividing response values into three groups (low / mid / high), Percentile 70 can be used as the cutoff point for classifying the low group.
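
Several of the summary statistics above can be reproduced with Python's standard `statistics` module. This is a rough sketch with made-up values: the helper name `describe` is hypothetical, and using the sample standard deviation (`statistics.stdev`) is an assumption, since the glossary does not say whether the sample or population formula applies. Percentiles are left out.

```python
import statistics


def describe(values):
    """Summarize numeric responses following the glossary definitions."""
    return {
        "mean": statistics.fmean(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "stdev": statistics.stdev(values),
        # When several values tie for most frequent, keep the smallest one.
        "mode": min(statistics.multimode(values)),
        "min": min(values),
        "max": max(values),
        "median": statistics.median(values),
    }


summary = describe([1, 2, 2, 3, 3, 10])
print(summary["mean"], summary["median"], summary["mode"])  # prints 3.5 2.5 2
```

In this example the single extreme value of 10 pulls the mean (3.5) above the median (2.5), which is the pattern the median entry describes.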

4️⃣ Rating Variable Terms

TOP and BOTTOM

When analyzing rating responses, you often need to look at the proportion of clearly positive or clearly negative opinions, excluding responses that show a neutral tendency. For the most commonly used 5-point scale rating variable, it is common to break down and look at TOP 2 (positive perception) and BOTTOM 2 (negative perception).

  • TOP 2: For a 5-point scale rating variable, TOP 2 represents the sum of the proportions of respondents who gave 5 and 4 points.

  • BOTTOM 2: For a 5-point scale rating variable, BOTTOM 2 represents the sum of the proportions of respondents who gave 1 and 2 points.
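
As a quick illustration, TOP 2 and BOTTOM 2 for a 5-point scale can be computed as below. The sketch assumes ratings are coded as integers from 1 to 5; the function name `top_bottom_2` and the sample scores are made up for the example.

```python
def top_bottom_2(scores):
    """Return (TOP 2 %, BOTTOM 2 %) for 5-point ratings coded 1 to 5."""
    if not scores:
        raise ValueError("scores must not be empty")
    n = len(scores)
    top = sum(1 for s in scores if s >= 4)      # respondents who gave 4 or 5 points
    bottom = sum(1 for s in scores if s <= 2)   # respondents who gave 1 or 2 points
    return 100 * top / n, 100 * bottom / n


scores = [5, 4, 4, 3, 3, 2, 1, 5, 4, 3]
top2, bottom2 = top_bottom_2(scores)
print(top2, bottom2)  # prints 50.0 20.0
```

The neutral responses (3 points) count toward the total but fall into neither group, so TOP 2 and BOTTOM 2 do not have to add up to 100%.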

5️⃣ NPS Variable Terms

NPS variables are measured on an 11-point scale, and responses are classified and analyzed according to the following criteria.

  • NPS: The Net Promoter Score (NPS) is an indicator of customer loyalty. Responses are collected on an 11-point scale from 0 to 10, and the score is calculated by subtracting the percentage of detractors from the percentage of promoters, so it can range from -100 to +100.

  • Promoters: Customers who have high satisfaction with our brand/product and actively recommend it to others. Respondents who gave 9–10 points in the NPS survey are classified as promoters.

  • Passives: Customers who are somewhat satisfied but could leave at any time if a better brand/product appears. Respondents who gave 7–8 points in the NPS survey are classified as passives.

  • Detractors: Customers who are not satisfied with our brand/product and may also relay negative feedback to others. Respondents who gave 0–6 points in the NPS survey are classified as detractors.
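
The classification above leads directly to the usual NPS calculation: the percentage of promoters minus the percentage of detractors. The sketch below assumes integer scores from 0 to 10; the function name `net_promoter_score` and the sample scores are illustrative only.

```python
def net_promoter_score(scores):
    """Return NPS (-100 to +100) for responses scored 0 to 10."""
    if not scores:
        raise ValueError("scores must not be empty")
    promoters = sum(1 for s in scores if s >= 9)    # 9-10 points
    detractors = sum(1 for s in scores if s <= 6)   # 0-6 points
    # Passives (7-8 points) count toward the total but not toward either group.
    return 100 * (promoters - detractors) / len(scores)


scores = [10, 9, 9, 10, 8, 7, 6, 3, 10, 9]
nps = net_promoter_score(scores)
print(nps)  # prints 40.0
```

In this sample, 6 of 10 respondents are promoters (60%) and 2 are detractors (20%), giving an NPS of 40.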

The TOP & BOTTOM criteria shown on the results screen vary slightly depending on the type of scale used for the survey. The table below shows the specific criteria.

Scale: Criteria

  • 3-point scale: Since 1 point = Bottom and 3 points = Top, TOP/BOTTOM is not displayed separately

  • 4, 5, 6-point scale: TOP 2 / BOTTOM 2

  • 7, 9, 10-point scale: TOP 3 / BOTTOM 3

  • 11-point scale: Based on Net Promoter Score: 0–6 points = Detractors, 7–8 points = Passives, 9–10 points = Promoters
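
The scale-dependent criteria above can be expressed as a small lookup. This sketch assumes ratings are coded from 1 up to the number of scale points; `BOX_SIZE` and `top_bottom` are hypothetical names. Scales that the criteria do not list, as well as the 3-point and 11-point scales that are handled differently, return `None`.

```python
# Number of points counted as TOP / BOTTOM for each scale listed in the criteria.
BOX_SIZE = {4: 2, 5: 2, 6: 2, 7: 3, 9: 3, 10: 3}


def top_bottom(scores, scale_points):
    """Return (TOP %, BOTTOM %) for the given scale, or None if TOP/BOTTOM does not apply."""
    k = BOX_SIZE.get(scale_points)
    if k is None or not scores:
        return None
    n = len(scores)
    top = sum(1 for s in scores if s > scale_points - k)  # the k highest points
    bottom = sum(1 for s in scores if s <= k)             # the k lowest points
    return 100 * top / n, 100 * bottom / n


result = top_bottom([7, 6, 5, 4, 3, 2, 1, 7, 6, 1], scale_points=7)
print(result)  # prints (50.0, 40.0)
```

On the 7-point example, TOP 3 covers 5 to 7 points and BOTTOM 3 covers 1 to 3 points, with the midpoint (4) excluded from both.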

Mean and Standard Deviation

Strictly speaking, rating variables are not interval scales, so the mean may not be an entirely appropriate summary. In other words, the difference between 5 points ('Very Satisfied') and 4 points ('Satisfied') is not necessarily the same as the difference between 4 points ('Satisfied') and 3 points ('Neutral'). However, since it is customary to look at the mean and standard deviation of rating responses, Opensurvey also provides them for convenience.

  • Mean: Represents the arithmetic mean of all numeric response data.

  • Standard deviation: Indicates how spread out the data is from the mean. The larger the value, the more widely distributed the data; the smaller the value, the more closely it is distributed around the mean.
