Synthetic Data Generator

Example of synthetic data generated for public health care facilities in Ukraine

The Synthetic Data Generator automates the creation of synthetic data for various use cases. The tool works as a wrapper around the Open Buildings data, so it can generate any sort of synthetic data for a particular city or region anywhere in the world. For example, we have used the tool in UNITAC projects in Namibia and Ukraine, where we generated synthetic data on public facilities such as hospitals and clinics, as well as data for digital job cards based on predefined templates of KoboToolbox forms. Overall, the generated data has helped us bridge the gap between idea and prototype, so that we can manage the expectations of local authorities, facilitate tangible milestones, and establish data standards.

The tool's entire tech stack consists of lightweight front-end technologies and libraries based on HTML, JavaScript, and CSS. Users double-click the HTML file to open it and upload a GeoJSON file with the buildings data of a particular city, and the tool generates the synthetic data automatically. A Python-based Streamlit wrapper is also provided; it makes it easy to download the data for a particular city or region, which is then fed into the tool for synthetic data generation.
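As a rough illustration of the idea described above (turning a GeoJSON file of building footprints into synthetic records), here is a minimal sketch. It is not the tool's own code, which is written in JavaScript and available on request; it is written in Python only for brevity. The template `FACILITY_TEMPLATE`, the function names, the `share` parameter, and the output fields are all hypothetical and chosen for illustration.

```python
import json
import random

# Hypothetical attribute template: field name -> allowed values.
FACILITY_TEMPLATE = {
    "facility_type": ["hospital", "clinic", "pharmacy"],
    "ownership": ["public"],
    "condition": ["good", "fair", "poor"],
}


def ring_centroid(ring):
    """Average of a ring's vertices (a rough centroid, not area-weighted)."""
    # GeoJSON rings repeat the first vertex at the end; skip the repeat.
    points = ring[:-1] if len(ring) > 1 and ring[0] == ring[-1] else ring
    lon = sum(p[0] for p in points) / len(points)
    lat = sum(p[1] for p in points) / len(points)
    return [lon, lat]


def generate_synthetic_points(buildings, template, share=0.1, seed=0):
    """Sample a share of building polygons and attach synthetic attributes."""
    rng = random.Random(seed)  # fixed seed keeps the output reproducible
    polygons = [
        f for f in buildings["features"]
        if (f.get("geometry") or {}).get("type") == "Polygon"
    ]
    count = min(len(polygons), max(1, round(len(polygons) * share)))
    features = []
    for i, building in enumerate(rng.sample(polygons, count), start=1):
        properties = {"id": f"synthetic-{i}", "synthetic": True}
        for field, options in template.items():
            properties[field] = rng.choice(options)
        outer_ring = building["geometry"]["coordinates"][0]
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": ring_centroid(outer_ring)},
            "properties": properties,
        })
    return {"type": "FeatureCollection", "features": features}


if __name__ == "__main__":
    # Tiny inline example instead of a real city extract.
    demo = {
        "type": "FeatureCollection",
        "features": [{
            "type": "Feature",
            "geometry": {
                "type": "Polygon",
                "coordinates": [[[0, 0], [2, 0], [2, 2], [0, 2], [0, 0]]],
            },
            "properties": {},
        }],
    }
    print(json.dumps(generate_synthetic_points(demo, FACILITY_TEMPLATE), indent=2))
```

Every generated record is tied to the footprint of an actual building, so the output looks plausible on a map while containing no real facility data. Marking each record with a `synthetic` flag is a design choice of this sketch, meant to keep generated data from being mistaken for real data later.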

Want to use the tool in your region?

Contact unitac@un.org for free access to the code and more information about how to use the tool.

Synthetic Data Generator interface

Example of synthetic data generated for service reports in Rundu to prototype the Digital Job Card

Partners

The tool is used in internal data workflows to facilitate tool integration and rapid prototyping with projects and stakeholders in Ukraine and Namibia.

Impact

The Synthetic Data Generator:

  • Expedites the development of digital prototypes and dashboards that require geospatial data.
  • Enables the demonstration of technical solutions or data analysis where the real data is unavailable, inaccurate, or inaccessible.
  • Scales easily to any place on the globe, as the data generation is based on the specific buildings of a particular city or region.
  • Is invaluable for researchers and academics who want to showcase their work without worrying about data privacy regulations.
  • Can be adapted easily to any type of use case.
  • Comes with a Dockerized setup for ease of deployment across different environments.

Scalability

The UNITAC team has used the Synthetic Data Generator in several projects and prototypes, and it is available for other developers and researchers to use. The tool needs no server for deployment and can run entirely offline, with very few requirements for generating synthetic data.