Interested in our Smart Data?
Provide us with your email and we'll send you a data sample of what we do! Each sample data set contains 10 rows and comes in the file format deemed most appropriate for that data set. Full descriptions of each data set can be found below.
1) Synthetic IRS 1040EZ Tax Sample:
The 1040EZ 2012 data model produces simulated 2012 1040EZ tax returns in MeF (Modernized e-File) format, along with the same data formatted as images of paper returns and as CSV. The sample provided contains a data CSV file, a folder containing PDF images of 1040EZ paper submissions, a zip folder containing MeF-formatted versions of the 1040EZ 2012 submissions, and manifest and submission XML files.
2) SF86 Sample:
Standard Form 86 (SF 86) is a United States government form that individuals complete when applying for a US security clearance. The form is very detailed and 136 pages long. The sample is intended as a proof of concept showing that simulated data can be useful for testing software algorithms that detect anomalies in data. The following simplifying assumptions apply to current data sets:
3) Follow the Money Sample:
Follow the Money is a data model that produces data sets used to develop and score fraud detection algorithms. Follow the Money is also useful in the financial services and medical billing industries. Note: the sample pictured is not a replica of the sample you will receive. The picture is based on a very large data set. For the Follow the Money data sample, we will provide:
4) HR & Payroll Sample:
HR & Payroll is a data model that produces data sets used in HR and payroll applications. The HR outputs list employees, addresses and salaries, while the Payroll outputs provide pay stub data, including earnings to date. Two payroll output files are provided to demonstrate the longitudinal aspects of payroll: things change over time. Examples include new hires and personnel departures. Instances of these events are tracked in the scenario answer file.
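As a rough illustration of how two payroll output files might be compared to surface new hires and departures, here is a minimal Python sketch. The file layout is not described here, so the `employee_id` and `earnings_to_date` column names, the CSV format, and the in-memory rows are all assumptions made for illustration only; they are not the actual sample contents.

```python
import csv
import io


def load_employee_ids(csv_file, id_column="employee_id"):
    """Return the set of employee IDs found in one payroll output file.

    `id_column` is a hypothetical column name, not the real sample schema.
    """
    return {row[id_column] for row in csv.DictReader(csv_file)}


def compare_pay_periods(earlier_ids, later_ids):
    """Classify IDs as new hires, departures, or continuing employees."""
    return {
        "new_hires": sorted(later_ids - earlier_ids),
        "departures": sorted(earlier_ids - later_ids),
        "continuing": sorted(earlier_ids & later_ids),
    }


# Made-up rows standing in for the two payroll output files.
period_1 = io.StringIO("employee_id,earnings_to_date\nE001,2000\nE002,1800\n")
period_2 = io.StringIO("employee_id,earnings_to_date\nE001,4000\nE003,1500\n")

changes = compare_pay_periods(
    load_employee_ids(period_1),
    load_employee_ids(period_2),
)
print(changes)
```

The events found this way could then be checked against the scenario answer file, which tracks the same kinds of changes.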