TAGCyber recently conducted a research survey on how valuable practitioners consider synthetic data for cyber security product testing. Below is a description of the survey and an outline of the scoring scale participants used.
A dozen enterprise security practitioners were recently asked to rate, on a scale of 1
through 5 (not valuable to highly valuable), the value of using synthetic data for cyber
security product testing. The results averaged 3.85, which corresponds to a largely
favorable view of using synthetic data for this purpose.
The scale values were defined as follows: synthetic data need not be used (score = 1),
synthetic data should be used but is not required (score = 2), synthetic data is appropriate
and should be used (score = 3), synthetic data should be encouraged and is valuable
(score = 4), and synthetic data is a valuable and required element of our program
(score = 5). Participants were encouraged to answer in a manner that integrated their
personal and organizational views.
The survey results suggest that the use of synthetic data is broadly encouraged among the practitioners surveyed. Reasons for viewing synthetic data as beneficial include its value with respect to security breaches and its flexibility for production and life-cycle testing. Listed below are a few participant quotes that reflect the conclusions drawn from the survey.
“I’d say there is actually an emerging market need for labeled, domain-specific datasets.
It’s far easier to concoct them algorithmically.”
“I think this is at least a 4. We are a software company, not a service provider so we
don't want to touch or see actual customer data as it would require us to be governed
by privacy laws such as GDPR and CCPA.”
“I think I would score this around 3.75 (4 if you need a round number). You’ll likely have
lots of POC’s, and it is definitely good to not use live data for these if possible.”
Based on the results of this research survey, we see great promise for synthetic data and for the value it can bring when integrated with future technologies.