What Is Response Cleaning?
Response cleaning is the process of refining survey data by excluding insincere responses, responses based on a misunderstanding of the question, and outliers that deviate significantly from overall trends. Because cleaned responses are left out of analysis results, you can derive more precise and reliable insights.
Why Should You Use Response Cleaning?
The purpose of a survey is not simply to collect large amounts of data, but to secure accurate data that can be used for decision-making. However, collected responses may include answers from respondents who did not fully read the questions, or who entered meaningless content. Response cleaning is the process of organizing such data to improve the reliability of analysis.
1️⃣ Securing data reliability for accurate analysis
When the data includes answers selected at random without properly reading the questions, or meaningless text, averages and ratios can be distorted from their true values. Removing this 'noise data' through response cleaning gives you more reliable analysis results that better reflect actual market conditions.
2️⃣ Improving efficiency in data review and analysis
Individually checking thousands of responses requires a great deal of time and effort. By using Dataspace's algorithm-based cleaning features, you can quickly identify responses likely to be insincere, significantly reducing the time spent on data review and analysis.
3️⃣ Improving sample management and data usability
By trimming over-collected responses and balancing representative samples (quotas), you can build a data set made up of respondents suitable for analysis. This allows you to obtain more meaningful analysis results even from a limited sample.
How to Clean Responses
📌 Response cleaning is available on all plans.
Dataspace provides both manual review cleaning and algorithm-based automatic cleaning features to suit your situation.
1️⃣ Clean by reviewing responses individually
If you want to carefully refine data by checking each response one by one, use the manual review cleaning method.
Step 1. Select the desired survey, then navigate to [Analytics > Responses] from the top menu.
Step 2. Select responses to clean by referring to the following indicators.
Response Quality Score: Calculated by combining the pattern-based score (sincerity_score) and the AI response quality score (ai_quality_score). It lets you assess response sincerity based on the respondent's response patterns (e.g., response speed) and the content of open-ended responses.
Pattern-based score: A score that evaluates a respondent's sincerity based on response patterns such as response speed.
AI response quality score: A score assigned by AI, which analyzes the context of open-ended responses to detect answers that are unrelated to the question or meaningless.
💡 Usage tip | If the 'AI Response Quality Inspection' feature is enabled in the collection group, the AI response quality score will be reflected alongside the pattern-based score when calculating the Response Quality Score. Conversely, if the feature is disabled, only the pattern-based score will be reflected.
💡 Usage tip | Clicking the [Load Variables] button in the top right of the Responses screen allows you to add various reference variables from metadata for more detailed review.
Step 3. Check the responses to exclude, then click the [Clean] button in the bottom right of the screen.
💡 Usage tip | In the 'Response Cleaning' section at the bottom of the Responses screen, you can also clean specific responses individually by directly entering the response ID or UID.
2️⃣ Use one-click cleaning
If you want to quickly organize a large volume of collected responses, use Opensurvey's algorithm-based one-click cleaning feature. It automatically classifies responses to be cleaned based on multiple review criteria, making the data cleaning process much more efficient.
Step 1. Select the desired survey, then navigate to [Analytics > Responses] from the top menu.
Step 2. Click the [One-click Cleaning] button in the top right of the 'Responses' screen.
Step 3. Set the cleaning threshold score by referring to the Response Quality Score graph. Responses to be cleaned are automatically classified according to the set score threshold.
Step 4. After reviewing the classified responses, click the [Clean] button to exclude those responses from analysis.
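To picture what the threshold in Step 3 does, here is a minimal sketch. It assumes the responses have been exported as records with a `response_id` and a `quality_score`; both field names, the 0 to 100 scale used in the sample data, and the rule that scores strictly below the threshold are flagged are assumptions made for illustration, not Dataspace's actual export format or algorithm.

```python
from dataclasses import dataclass


@dataclass
class Response:
    response_id: str      # hypothetical field name
    quality_score: float  # hypothetical field holding the Response Quality Score


def split_by_threshold(responses, threshold):
    """Return (to_clean, to_keep).

    Assumed rule: a response is flagged for cleaning when its score
    is strictly below the threshold.
    """
    to_clean = [r for r in responses if r.quality_score < threshold]
    to_keep = [r for r in responses if r.quality_score >= threshold]
    return to_clean, to_keep


# Made-up sample data, for illustration only.
sample = [
    Response("r1", 92.0),
    Response("r2", 35.5),
    Response("r3", 60.0),
]
flagged, kept = split_by_threshold(sample, threshold=60.0)
print([r.response_id for r in flagged])  # ['r2']
```

As in Step 4, the flagged list is only a set of candidates. Review it before excluding anything.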
3️⃣ Exclude over-collected responses
In the course of running a survey, more data may be collected than the target number of responses. Especially when collecting responses based on collection groups (e.g., quota samples by gender or age), some groups may exceed the target. In such cases, organizing the excess responses allows you to maintain the target sample structure and tidy up the data to be used for analysis.
Step 1. Select the desired survey, then navigate to [Analytics > Responses] from the top menu.
Step 2. At the bottom of the Responses screen, enable the 'View Response Status' setting in the 'Response Cleaning' area.
Step 3. For each collection group, check the total number of responses, number of excess responses, number of cleaned responses, number of valid responses, collection target, and the number of responses short of or in excess of the target. This shows at a glance which collection groups have collected more responses than the target.
💡 Usage tip | By selecting the 'Rows exceeding target' or 'Rows below target' checkboxes, you can view only the collection groups that meet those conditions.
Step 4. Click the [Exclude Excess Responses] button to randomly exclude responses that exceed the target for each collection group, organizing the data to match the target sample.
💡 Usage tip | Using the Exclude Excess Responses feature prevents responses from being overly concentrated in specific groups, and allows you to analyze data while maintaining the sample balance set during the survey design stage.
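The idea behind the response status view and random exclusion can be sketched as follows. This is an illustration under stated assumptions, not Dataspace's implementation: the function names `group_gap` and `pick_excess`, the `(response_id, group)` input shape, and the group labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def group_gap(valid_responses, targets):
    """Valid responses minus target, per collection group.

    Positive values mean the group is over target, negative values under.
    """
    counts = Counter(group for _, group in valid_responses)
    return {group: counts.get(group, 0) - target for group, target in targets.items()}


def pick_excess(valid_responses, targets, seed=None):
    """Randomly choose response IDs to exclude so no group exceeds its target.

    valid_responses: list of (response_id, group) tuples (assumed shape).
    targets: dict mapping group name to target count.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for response_id, group in valid_responses:
        by_group[group].append(response_id)

    excluded = set()
    for group, ids in by_group.items():
        # Groups without a target are left untouched.
        excess = len(ids) - targets.get(group, len(ids))
        if excess > 0:
            excluded.update(rng.sample(ids, excess))
    return excluded


# Made-up sample: 12 responses in one group, 8 in another, target 10 each.
responses = [(f"m{i}", "male_20s") for i in range(12)]
responses += [(f"f{i}", "female_20s") for i in range(8)]
targets = {"male_20s": 10, "female_20s": 10}

print(group_gap(responses, targets))  # {'male_20s': 2, 'female_20s': -2}
excluded = pick_excess(responses, targets, seed=0)
print(len(excluded))  # 2
```

Only the over-target group loses responses. The under-target group is unchanged, because exclusion cannot make up a shortfall.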
4️⃣ Exclude automated bot responses
📌 Bot Response Monitoring settings are available on the Enterprise plan.
When conducting an online survey, automated programs or macros — rather than real respondents — may participate in the survey and leave responses.
The Bot Response Monitoring feature provides a score that estimates how likely a response is to come from a bot, based on various behavioral patterns observed while the respondent takes the survey. This helps you identify suspected bot responses and clean them. (📖 Reference: Bot Response Monitoring Settings)
Step 1. Select a survey for which 'Bot Response Monitoring' was enabled when creating the collection group, then navigate to [Analytics > Responses].
Step 2. Click 'Load Variables', then check the bot assessment score via [Metadata > recaptcha score].
Step 3. Referring to the criteria below, select the responses you want to exclude and click [Clean].
0.1 or below: Highly likely to be a bot — cleaning is strongly recommended.
0.2 – 0.4: May be a bot, but could also be a real respondent. Review the response content as well before making a judgment.
0.5 or above: Likely to be a real respondent.
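If you export the recaptcha score and want to apply the criteria above in bulk, a minimal sketch might look like this. The function name and the returned labels are hypothetical. The criteria above do not say how to treat scores that fall between the listed bands (for example 0.15 or 0.45), so this sketch assumes they are sent to manual review.

```python
def bot_recommendation(recaptcha_score: float) -> str:
    """Map a recaptcha score to a suggested action, following the bands above.

    Assumption: any score strictly between 0.1 and 0.5 is treated as
    "review", including values the listed bands do not explicitly cover.
    """
    if recaptcha_score <= 0.1:
        return "clean"   # highly likely to be a bot
    if recaptcha_score >= 0.5:
        return "keep"    # likely to be a real respondent
    return "review"      # ambiguous: check the response content first


for score in (0.1, 0.3, 0.9):
    print(score, bot_recommendation(score))
```

The "review" label is a prompt to read the response content, not a verdict.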
🧙 Want to restore a cleaned response?
Responses that were excluded through cleaning can be restored as valid responses at any time.
In the Responses screen, select responses displayed as 'Cleaned', then click the [Validate] button in the bottom right. The selected responses will be included in analysis again.
You can also restore responses by directly entering the response ID in 'Restore Target' in the 'Response Cleaning' section at the bottom of the Responses screen.
If you want to review a response in detail before restoring it, click the arrow icon to the right of the response, then use the 'View Response in Detail' screen to select and validate the response.
💡 The Responses Screen Has More Features
✅ Load Variables
Load Variables lets you add more variables to the response table. Variables are measurable, calculable components of the data, such as age, gender, region, and response results.
Loading New Variables
Step 1. Click the [Load Variables] button in the top right of the response table.
Step 2. When the Load Variables screen appears, select the desired tab from Responses, Profile, or Metadata, then select the variables to load and click [Confirm Selection].
Responses: Loads the response data for survey questions.
Profile: Loads respondent profile data as variables. In Dataspace, profile variables are displayed only for surveys conducted with respondents registered as My Panel.
Metadata: Loads additional information related to responses. Examples include Panel ID, response start time, response end time, cleaning reference variables, cleaning reason, and number of insincere responses.
Step 3. Once variables are loaded, their data is added to the response table, where you can review each respondent's values for every loaded variable side by side.
If you want to remove already-loaded variables or add new ones, go back to the [Load Variables] screen, select or deselect the desired variables, and click [Confirm Selection].
Frequently Asked Questions
Q. Can I also clean image responses?
A. Yes, you can. First, add the image response variable via [Load Variables].
Then, click the arrow icon to the right of the response in the response list to navigate to the 'View Response in Detail' screen. You can directly check the images attached by the respondent on this screen.
After reviewing the images, decide whether to clean the response.
Q. Do I have to exclude excess responses to match the representative sample?
A. It is recommended in some cases, depending on the purpose of the survey. For surveys aimed at understanding market conditions or representing the population, analysis results may be distorted if responses from a specific group (e.g., a specific age group) are collected in excess of the target. In such cases, use the [Exclude Excess Responses] feature to balance the sample composition.
Q. What criteria should I use for cleaning?
A. It is important to set criteria that match the purpose and context of the survey. The standard for 'sincere responses' can vary depending on the company or survey purpose. Initially, rather than judging solely by score, it is advisable to check text response content and response speed together to determine the criteria. Based on this, set appropriate cleaning criteria suited to the characteristics of the project.
Have your questions about response cleaning been resolved?
If you are unsure how to handle a specific response during the data cleaning process, or if a feature is not working properly, please feel free to contact us at any time via the [Help Center icon] in the bottom right corner of the screen.
Our team will do our best to help you resolve any difficulties you're experiencing.
