ExactData


The Data Blog

An Overview of Synthetic Data Generation Technical Approaches

11/19/2021

 
There are two fundamental approaches to generating synthetic data. The first starts from a production database: the data is accessed and then modified or masked, either manually or with Extract, Transform, Load (ETL) tools, or it is analyzed with AI to generate a facsimile that mirrors some of its attributes. The second approach does not use a production database at all and generates fully synthetic data. The fidelity of fully synthetic data varies widely, from random alphanumeric characters to high-fidelity data that is correlated down to the field level across systems of systems, with business logic and workflow rules, correct statistical distributions, correlation over the time axis, high use case coverage, engineered errors, and system response files.

A method based on a production data source is typically best when policy decisions will be made with the data. A fully synthetic approach is generally better suited to most other use cases: you do not need access to a confidential database, which might not even exist yet for a future-state system; there are no privacy restrictions on how you can use the data; it is less expensive and faster; and, with a known ground truth and expected system response files, you can measure and improve system error rates.
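To make the contrast concrete, here is a minimal Python sketch of both approaches. It is an illustration under stated assumptions, not a description of any particular product: the record fields, the salt, and the function names (`mask_record`, `generate_records`, `validate`, `buggy_validate`, `system_error_rate`) are all hypothetical. The first function masks a copy of a production-style record. The rest generate fully synthetic records with engineered errors and a known ground truth, then use that ground truth to score a system under test.

```python
import hashlib
import random


def mask_record(record, salt="demo-salt"):
    """Approach 1 (simplified): return a masked copy of a production-style record."""
    masked = dict(record)  # copy, so the source record is left untouched
    digest = hashlib.sha256((salt + record["name"]).encode()).hexdigest()
    masked["name"] = "user_" + digest[:8]            # replace the name with a pseudonym
    masked["ssn"] = "***-**-" + record["ssn"][-4:]   # keep only the last four digits
    return masked


def generate_records(n, error_rate=0.2, seed=0):
    """Approach 2 (simplified): fully synthetic records with a known ground truth."""
    rng = random.Random(seed)  # seeded, so the data set is reproducible
    records = []
    for i in range(n):
        age = rng.randint(18, 90)
        has_error = rng.random() < error_rate
        if has_error:
            age = -age  # engineered error: an impossible negative age
        # "expected_valid" plays the role of the expected system response
        records.append({"id": i, "age": age, "expected_valid": not has_error})
    return records


def validate(record):
    """Hypothetical system under test: accepts only plausible ages."""
    return 0 <= record["age"] <= 120


def buggy_validate(record):
    """Same check with a deliberate bug: the lower bound is missing."""
    return record["age"] <= 120


def system_error_rate(records, validator):
    """Fraction of records where the system's answer differs from the ground truth."""
    wrong = sum(validator(r) != r["expected_valid"] for r in records)
    return wrong / len(records)


if __name__ == "__main__":
    print(mask_record({"name": "Alice Example", "ssn": "123-45-6789"}))
    data = generate_records(1000, error_rate=0.1, seed=42)
    print("correct validator:", system_error_rate(data, validate))
    print("buggy validator:  ", system_error_rate(data, buggy_validate))
```

Because every generated record carries its expected response, the error rate can be computed without anyone labeling data by hand. That is the practical meaning of "known ground truth" above. The masking function, by contrast, still needs a real record as input, and real masking or de-identification involves far more than this sketch shows.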
