What Is Response Cleaning?
Response cleaning is the process of excluding insincere responses, responses based on a misunderstood question, and outliers that deviate significantly from the overall trend during survey analysis. Because cleaned responses are not reflected in analysis results, you can derive more precise and reliable insights.
Why Should You Use Response Cleaning?
The goal of a survey is not simply to collect a lot of data, but to secure accurate data that can be used for decision-making. However, the responses actually collected may include answers given without reading the question carefully, or entries containing meaningless text. Response cleaning is the process of organizing this data to improve the reliability of analysis.
1️⃣ Securing data reliability for accurate analysis
If the data includes responses that were selected at random without properly reading the questions, or responses containing meaningless text, averages and ratios may be distorted compared to reality. Removing this 'noise data' through response cleaning gives you reliable analysis results that better reflect actual market conditions.
2️⃣ Improving efficiency in data review and analysis
Checking thousands of responses one by one requires a great deal of time and effort. Using Dataspace's algorithm-based cleaning features, you can quickly identify responses that may be insincere, significantly reducing the time spent on data review and analysis.
3️⃣ Improving sample management and data usability
Organizing excess responses or balancing representative samples (quotas) lets you compose a dataset with respondents appropriate for analysis. This allows you to obtain more meaningful analysis results even from a limited sample.
How to Clean Responses
📌 Response cleaning is available on all plans.
Dataspace provides both manual selection and algorithm-based automatic cleaning features to suit the user's situation.
1️⃣ Review and clean manually
If you want to carefully refine data by reviewing each response one by one, try manually reviewing and cleaning responses.
Step 1. Select the desired survey and navigate to the [Analytics > Response] screen from the top menu.
Step 2. Refer to the following metrics to identify responses that need cleaning.
finished_at: Check whether the response was submitted after the collection deadline.
sincerity_score: Determine whether the response has a low sincerity score that makes it difficult to use for analysis.
💡 Usage tip | Clicking the [Load variables] button in the upper right of the response screen lets you add various reference variables from metadata for a more detailed review.
Step 3. Check the responses to exclude, then click the [Clean] button in the lower right of the screen.
💡 Usage tip | You can also enter the response ID or UID directly in 'Response cleaning' at the bottom of the response screen to individually clean only specific responses.
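If you export responses and want to pre-screen them before reviewing in the response screen, the two checks in Step 2 can be sketched as follows. This is a minimal illustration, not Dataspace's own logic: the row layout, the `response_id` field, the deadline, and the 0 to 100 score scale with a cutoff of 50 are all assumptions. Only `finished_at` and `sincerity_score` come from this guide.

```python
from datetime import datetime

# Hypothetical exported rows; the layout and values are assumptions.
responses = [
    {"response_id": "r1", "finished_at": "2024-05-01T10:00:00", "sincerity_score": 82},
    {"response_id": "r2", "finished_at": "2024-05-03T09:30:00", "sincerity_score": 91},
    {"response_id": "r3", "finished_at": "2024-05-01T12:15:00", "sincerity_score": 35},
]

DEADLINE = datetime.fromisoformat("2024-05-02T00:00:00")  # assumed collection deadline
MIN_SCORE = 50  # assumed cutoff; choose one that fits your project


def review_reasons(row, deadline=DEADLINE, min_score=MIN_SCORE):
    """Return the reasons a response deserves a closer manual look."""
    reasons = []
    if datetime.fromisoformat(row["finished_at"]) > deadline:
        reasons.append("submitted after deadline")
    if row["sincerity_score"] < min_score:
        reasons.append("low sincerity score")
    return reasons


# Keep only the responses that triggered at least one check.
flagged = {}
for row in responses:
    reasons = review_reasons(row)
    if reasons:
        flagged[row["response_id"]] = reasons

print(flagged)
```

The flagged IDs are only candidates. The decision to clean each response still belongs to the reviewer.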
2️⃣ Use one-click cleaning
If you want to quickly organize a large volume of collected responses, try using Opensurvey's algorithm-based one-click cleaning feature. It automatically classifies cleaning targets based on the sincerity score, making the data cleaning process much more efficient.
Step 1. Select the desired survey and navigate to the [Analytics > Response] screen from the top menu.
Step 2. Click the [One-click cleaning] button in the upper right of the 'Response' screen.
Step 3. Referring to the sincerity score (sincerity_score) distribution graph displayed, set the cleaning threshold score. Responses subject to cleaning are automatically classified according to the set score threshold.
Step 4. After reviewing the classified responses, click the [Clean] button to exclude those responses from analysis.
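To get a feel for how the threshold in Step 3 changes the number of cleaning targets, the classification can be sketched on a handful of scores. This is an illustrative assumption rather than the actual one-click algorithm: the function name `classify_by_threshold`, the sample scores, and the rule that a response is a target when its score is strictly below the threshold are all hypothetical.

```python
def classify_by_threshold(scores, threshold):
    """Split response IDs into cleaning targets and kept responses.

    Assumption: a response is a cleaning target when its score is
    strictly below the threshold.
    """
    targets = sorted(rid for rid, s in scores.items() if s < threshold)
    kept = sorted(rid for rid, s in scores.items() if s >= threshold)
    return targets, kept


# Hypothetical sincerity scores keyed by response ID.
scores = {"r1": 82, "r2": 47, "r3": 35, "r4": 60, "r5": 91}

# Compare a few candidate thresholds before committing to one.
for threshold in (40, 50, 70):
    targets, kept = classify_by_threshold(scores, threshold)
    print(threshold, targets, len(kept))
```

Raising the threshold cleans more responses, so it helps to compare a few candidate values against the distribution graph before clicking [Clean].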
3️⃣ Exclude excess responses
When running a survey, there are cases where more data is collected than the target response count. In particular, when collecting responses based on collection groups (e.g., gender or age-based quota samples), some groups may exceed their response target. In such cases, cleaning up excess responses lets you maintain the target sample structure while organizing the data for analysis.
Step 1. Select the desired survey and navigate to the [Analytics > Response] screen from the top menu.
Step 2. In the 'Response cleaning' area at the bottom of the response screen, activate the 'View response status' setting.
Step 3. For each collection group, check the total response count, number of excess responses, number of cleaned responses, number of valid responses, collection target count, and the number of responses by which the group falls short of or exceeds the target. This information lets you see at a glance which collection groups collected more responses than the target.
💡 Usage tip | Selecting the 'Rows exceeding target' or 'Rows falling short of target' checkboxes lets you view only the collection groups that meet those conditions separately.
Step 4. Click the [Exclude overquota responses] button. Responses that exceed the target in each collection group are randomly excluded, so that the data matches the target sample.
💡 Usage tip | Using the exclude overquota responses feature prevents responses from being excessively concentrated in specific groups and allows you to analyze data while maintaining the sample balance set during the survey design stage.
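The idea of randomly excluding responses above each group's target can be sketched as follows. This is a simplified model under stated assumptions, not the product's implementation: the function `exclude_overquota`, the group labels, and the targets are hypothetical, and the sketch assumes every response in a group is equally likely to be excluded.

```python
import random
from collections import defaultdict


def exclude_overquota(responses, targets, seed=None):
    """Return the IDs to exclude so that no group exceeds its target.

    responses: list of (response_id, group) pairs
    targets:   dict mapping group -> target response count
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for rid, group in responses:
        by_group[group].append(rid)

    excluded = []
    for group, ids in by_group.items():
        # A group with no target is treated as having no excess.
        excess = len(ids) - targets.get(group, len(ids))
        if excess > 0:
            excluded.extend(rng.sample(ids, excess))
    return sorted(excluded)


# Hypothetical collection groups: one over target, one under target.
responses = [("r1", "F20s"), ("r2", "F20s"), ("r3", "F20s"),
             ("r4", "M20s"), ("r5", "M20s")]
targets = {"F20s": 2, "M20s": 3}

excluded = exclude_overquota(responses, targets, seed=7)
print(excluded)
```

In this example only the "F20s" group is over its target, so exactly one of its three responses is excluded. The "M20s" group is short of its target and is left untouched.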
🧙 Want to restore cleaned responses?
Responses excluded through cleaning can be restored to valid responses at any time.
In the response screen, select responses displayed with 'Cleaned' status, then click the [Validate] button in the lower right. The selected responses are included in analysis again.
You can also restore responses by directly entering the response ID in 'Restoration target' in the 'Response cleaning' area at the bottom of the response screen.
If you want to review a response in detail before restoring it, try clicking the arrow icon to the right of the response, then selecting and validating the response from the 'View Response Details' screen.
💡 The Response screen has even more features available.
✅ Load variables
Load variables is a feature for loading additional variables to view in the response table. A variable is one of the components of data — an item that can be measured or calculated, such as age, gender, region, and response results.
Loading new variables
Step 1. Click the [Load variables] button in the upper right of the response table.
Step 2. When the Load variables screen appears, choose the Response, Profile, or Metadata tab, select the variables to load, and click the [Done] button.
Response: Loads response data for survey questions.
Profile: Loads respondent profile data as variables. In Dataspace, profile variables are only displayed for surveys conducted with respondents registered as My Panel.
Metadata: Loads supplementary information related to responses, such as panel ID, response start time, response end time, cleaning reference variable, cleaning reason, and number of insincere responses.
Step 3. When variables are loaded, the corresponding variable data is added to the response table, and you can review each respondent's data by variable in the response table.
If you want to remove already-loaded variables or add new ones, go back to the [Load variables] screen, select or deselect the desired variables, and click the [Done] button.
Frequently Asked Questions
Q. Can image responses also be cleaned?
A. Yes, they can. First, add the image response variable from [Load variables].
Then, click the arrow icon to the right of the response in the response list to navigate to the 'View Response Details' screen. On this screen, you can directly view the image attached by the respondent.
After reviewing the image, decide whether to clean that response.
Q. Do I have to exclude excess responses to match the representative sample?
A. It is recommended depending on the research purpose. For surveys that need to understand market conditions or represent the population, if responses are collected in excess for a specific group (e.g., a specific age group), the analysis results may be distorted. In such cases, use the [Exclude overquota responses] feature to balance the sample composition.
Q. What criteria are best for cleaning?
A. It is important to set criteria appropriate to the research purpose and situation. The standard for a 'sincere response' may vary depending on the company or research purpose. It is best to start by reviewing text response content and response speed together rather than judging by score alone. Based on this, set appropriate cleaning criteria suited to the characteristics of your project.
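As one way to look at several signals together rather than the score alone, the sketch below combines response speed, text length, and sincerity score for a single exported row. All field names other than `sincerity_score` and `finished_at`, along with every cutoff value, are assumptions made for illustration. Adjust them to your own project.

```python
from datetime import datetime


def warning_signs(row, min_seconds=60, min_chars=5, min_score=50):
    """Return the names of the warning signs a response shows."""
    started = datetime.fromisoformat(row["started_at"])
    finished = datetime.fromisoformat(row["finished_at"])
    signals = {
        "too_fast": (finished - started).total_seconds() < min_seconds,
        "short_text": len(row["open_text"].strip()) < min_chars,
        "low_score": row["sincerity_score"] < min_score,
    }
    return [name for name, hit in signals.items() if hit]


# Hypothetical row: finished in 40 seconds with a 4-character text answer.
row = {
    "started_at": "2024-05-01T10:00:00",
    "finished_at": "2024-05-01T10:00:40",
    "open_text": "asdf",
    "sincerity_score": 55,
}

print(warning_signs(row))
```

Here the score alone would pass the assumed cutoff, but the short completion time and the meaningless text both suggest the response deserves a closer look.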
Have your questions about response cleaning been resolved?
If you're unsure how to handle a specific response during the data cleaning process, or if a feature is not working as expected, please contact us anytime via the [Customer Support icon] in the bottom right corner of your screen.
Our team will do its best to help you resolve any difficulties you're experiencing.
