Synthetic Dataset Generation Tool
Introducing Synthetic Dataset Generation to the Healthcare Industry using Unity Computer Vision and Perception Packages.
Our latest application for the healthcare industry uses Unity Computer Vision to let our client generate a vast array of synthetic images of their at-home finger-prick blood test kits.
Having worked closely with our client, we understand the importance of timely and precise responses in patient care, with a focus on efficiency and accuracy. That’s why our app harnesses the latest advancements in artificial intelligence and image recognition software, enabling healthcare providers to train their systems quickly and effectively.
How Does it Work?
Unity’s Perception package is designed to enhance supervised Machine Learning (ML) pipelines by providing new ways to generate large quantities of labelled synthetic data using Unity Simulation.
We’ve incorporated these features into our application and provided user-friendly controls for customising how the synthetic datasets are created, giving users control over lighting, scene clutter, quantity, and test-kit-specific settings.
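As a rough illustration of what controls like these amount to, the sketch below draws per-image scene settings at random from user-chosen ranges. It is written in Python purely for illustration and does not use the Unity Perception API; every name and range in it (`GenerationSettings`, `sample_scene`, `generate_scenes`, the default values) is hypothetical rather than taken from our application.

```python
import random
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    """Hypothetical user-facing controls; names and defaults are assumptions."""
    image_count: int = 1000
    light_intensity: tuple = (0.5, 1.5)      # min/max relative brightness
    clutter_objects: tuple = (0, 8)          # min/max distractor objects per scene
    kit_rotation_deg: tuple = (0.0, 360.0)   # min/max rotation of the test kit


def sample_scene(settings, rng):
    """Draw one randomised scene description from the configured ranges."""
    return {
        "light_intensity": rng.uniform(*settings.light_intensity),
        "clutter_objects": rng.randint(*settings.clutter_objects),
        "kit_rotation_deg": rng.uniform(*settings.kit_rotation_deg),
    }


def generate_scenes(settings, seed=0):
    """Return one scene description per requested image (seeded for repeatability)."""
    rng = random.Random(seed)
    return [sample_scene(settings, rng) for _ in range(settings.image_count)]


scenes = generate_scenes(GenerationSettings(image_count=5))
```

Widening a range (for example, allowing more clutter objects) increases the variety of the generated images, while fixing the seed makes a dataset reproducible.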
All of this combined creates a powerful tool that lets users generate a variety of custom datasets for training their AI image recognition tools with accurate, lifelike data, without needing to manually construct and label thousands of images. Our client can now generate thousands of labelled synthetic images in minutes rather than weeks, leading to huge savings in both the time and cost of training their AI model.
Tools like this are powerful for training AI because they reduce the need to gather and label thousands of real-world images. Real-world images are still desirable in AI training (if not essential for a successful model), but Unity have shown that training a model on both real-world and synthetic data increases accuracy:
“Mean Average Precision averaged across IoU thresholds of [0.5:0.95] (mAP), Mean Average Precision with a single IoU threshold of 0.5 (mAP@IoU=0.5), and the Mean Average Recall with a maximum of 100 detections (mAR) measured on a held-out set of 254 real-world images.”
Training Data (number of training examples) | mAP | mAP@IoU=0.5 | mAR@100 |
1.1 Real World (760) | 0.48 | 0.73 | 0.59 |
1.2 Synthetic (400,000) | 0.40 | 0.62 | 0.52 |
1.3 Synthetic (400,000) + Real World (76) | 0.60 | 0.83 | 0.67 |
1.4 Synthetic (400,000) + Real World (380) | 0.68 | 0.89 | 0.74 |
1.5 Synthetic (400,000) + Real World (760) | 0.70 | 0.92 | 0.75 |
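To make the comparison concrete, the short Python sketch below works out each row's mAP change relative to the real-world-only baseline. The figures are copied from the table above; the `relative_gain` helper is our own illustrative name, not part of Unity's tooling.

```python
# Rows copied from the table above: (training data, mAP, mAP at IoU=0.5, mAR).
results = [
    ("Real World (760)", 0.48, 0.73, 0.59),
    ("Synthetic (400,000)", 0.40, 0.62, 0.52),
    ("Synthetic (400,000) + Real World (76)", 0.60, 0.83, 0.67),
    ("Synthetic (400,000) + Real World (380)", 0.68, 0.89, 0.74),
    ("Synthetic (400,000) + Real World (760)", 0.70, 0.92, 0.75),
]


def relative_gain(value, baseline):
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0


baseline_map = results[0][1]  # mAP of the model trained on real-world data only

for name, m_ap, _map_50, _mar in results:
    gain = relative_gain(m_ap, baseline_map)
    print(f"{name}: mAP {m_ap:.2f} ({gain:+.0f}% vs real-world only)")
```

On these figures, combining the synthetic set with all 760 real-world images lifts mAP from 0.48 to 0.70, a relative gain of roughly 46%, and even 76 real-world images alongside the synthetic set gives a 25% gain.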
Outcome
With just a few clicks, our client can now generate a large volume of datasets, each meticulously tagged and categorised to facilitate the training of their AI model. This streamlined process ensures that our partners have access to the most up-to-date and accurate data, ultimately leading to faster and more accurate responses for patients.
The software speeds up both the training of AI models and the validation and verification of existing algorithms and image analysis.
Other Applications
There are many applications for tools like this in and out of healthcare. Synthetic dataset generation tools can help train AI for a variety of object detection and image recognition tasks, such as identifying products, actions, facial expressions, and more!
Synthetic dataset generation has many benefits: it improves the accuracy of the AI model, reduces the cost of dataset generation, and shortens iteration times. AI trained with synthetic data can be used in a variety of applications, including patient monitoring, AR inspection tools, crowd scanning, and much more.
How does the tool contribute to the speed of AI model training?
By enabling the generation of thousands of labelled images in minutes, the tool significantly speeds up the AI training process compared to manually constructing and labelling images, which could take weeks.
What are the performance metrics mentioned for AI models trained with synthetic and real-world data?
Key performance metrics include Mean Average Precision averaged across IoU (intersection over union) thresholds (mAP), Mean Average Precision at a single IoU threshold of 0.5 (mAP@IoU=0.5), and Mean Average Recall (mAR) with a maximum of 100 detections.
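For readers unfamiliar with IoU, here is a minimal Python sketch of how the overlap between a predicted and a ground-truth bounding box is scored. The box format `(x_min, y_min, x_max, y_max)` and the function names are our own choices for illustration, and the 0.05 step between thresholds follows the common COCO convention, which we are assuming applies to the [0.5:0.95] range quoted by Unity.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Width and height are clamped at zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(predicted, truth, threshold=0.5):
    """A detection counts as correct when its IoU meets the threshold."""
    return iou(predicted, truth) >= threshold


# Thresholds 0.50, 0.55, ..., 0.95 (the 0.05 step is an assumption, see above).
thresholds = [0.5 + 0.05 * i for i in range(10)]
```

A stricter threshold demands a tighter fit between prediction and ground truth, which is why the averaged mAP figures are lower than those at the single 0.5 threshold.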
What are some other applications of synthetic dataset generation in AI?
Both within and beyond healthcare, synthetic dataset generation can be used for patient monitoring, AR inspection tools, crowd scanning, and other applications requiring precise object detection and image recognition.
What are the main advantages of using synthetic datasets?
Synthetic datasets reduce the need for manually gathering and labelling real-world images, saving significant time and costs. They also help improve the accuracy of AI models when combined with real-world data.
Can the tool be customised for specific needs?
Yes, users can customise the synthetic datasets by adjusting various parameters, including lighting, scene clutter, the number of images, and settings specific to the object(s) being trained, allowing for tailored dataset generation.
The focus object can be changed to fit your needs, as can the background and foreground noise, to help train your model with appropriately lifelike imagery.