Dataset Repository Overview
This section collects diverse AM datasets for training and benchmarking GenAI models, spanning design, pre-process, build and monitoring, post-process, and quality measurement data. By leveraging these datasets, researchers can improve model performance and predictive capability and drive innovation in AM.
Dataset Categories:
- Benchmarking Datasets: Standardized datasets for evaluating and comparing GenAI model performance
- Training Datasets: Comprehensive collections for fine-tuning GenAI models on AM-specific tasks
- Datasets for Agents: Structured data for developing and testing GenAI agent-based systems
- Prompt Collections: Curated prompts and templates for effective GenAI interactions in AM contexts
Each dataset is provided with documentation on its structure, content, and recommended applications. For researchers and practitioners implementing GenAI in AM workflows, these resources provide a solid foundation for model development and evaluation.
Benchmarking Datasets
Benchmarking datasets are specifically designed to evaluate and compare the performance of different GenAI models on standardized AM tasks. These datasets include reference problems, expected outputs, and evaluation metrics to ensure consistent assessment.
AM-GenAI Benchmark Dataset
A comprehensive benchmark dataset for evaluating GenAI models across common AM tasks including design optimization, parameter selection, and defect classification.
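The record format below is a rough illustration only; all field names and values are hypothetical, and the dataset's documentation defines the actual schema. It shows how a single parameter-selection task might pair an input specification with a reference output and an evaluation rule:

```python
# Hypothetical benchmark record for a parameter-selection task.
# Field names and values are illustrative, not the dataset's actual schema.
record = {
    "task": "parameter_selection",
    "process": "laser_powder_bed_fusion",
    "input": {
        "material": "Ti-6Al-4V",
        "layer_thickness_um": 30,
        "objective": "minimize_porosity",
    },
    "reference_output": {
        "laser_power_w": 280,
        "scan_speed_mm_s": 1200,
        "hatch_spacing_um": 140,
    },
    "evaluation": {"metric": "mean_absolute_percentage_error"},
}
```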
Using Benchmarking Datasets
When using these benchmarking datasets, consider the following best practices (a minimal evaluation sketch follows the list):
- Use the provided evaluation metrics for consistent comparison
- Report all relevant model parameters and training details
- Benchmark against the provided baseline model performances
- Consider both quantitative metrics and qualitative assessments
- Document any preprocessing or modifications to the dataset
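As a concrete starting point, the sketch below scores a model over a benchmark stored as one JSON record per line. The file layout, the field names (`input`, `reference_output`), and the `metric_fn` signature are assumptions for illustration; substitute whatever the dataset's documentation specifies:

```python
import json

def evaluate(model_fn, benchmark_path, metric_fn):
    """Average a model's score over a JSON-lines benchmark file.

    model_fn:  callable taking a task input dict, returning a prediction
    metric_fn: the evaluation metric shipped with the dataset, assumed
               here to take (prediction, reference) and return a float
    """
    scores = []
    with open(benchmark_path) as f:
        for line in f:
            record = json.loads(line)
            prediction = model_fn(record["input"])
            scores.append(metric_fn(prediction, record["reference_output"]))
    return sum(scores) / len(scores)
```

Reporting the averaged score alongside the model parameters and any preprocessing keeps results comparable across research groups.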
Training Datasets
Training datasets are comprehensive collections designed for fine-tuning GenAI models on AM-specific tasks. These datasets include diverse examples that help models learn the particular characteristics and requirements of additive manufacturing processes.
AM Design Corpus
A large collection of 3D models specifically designed for additive manufacturing, with associated metadata on design intent, manufacturing constraints, and performance requirements.
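As a sketch of what such per-model metadata might look like (field names are hypothetical, chosen to mirror the three documented categories):

```python
# Hypothetical metadata entry for one model in the corpus; the actual
# schema is defined in the corpus documentation.
design_metadata = {
    "model_file": "bracket_v3.stl",
    "design_intent": "lightweight structural bracket",
    "manufacturing_constraints": {
        "process": "laser_powder_bed_fusion",
        "min_wall_thickness_mm": 0.8,
        "max_overhang_angle_deg": 45,
    },
    "performance_requirements": {
        "max_mass_g": 120,
        "min_safety_factor": 2.0,
    },
}
```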
AM Defect Classification Dataset
A public, labeled image dataset of common AM defects across multiple process types, with annotations on defect type, severity, and probable causes.
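A single annotation might resemble the hypothetical record below; the bounding-box field and exact keys are assumptions for illustration:

```python
# Hypothetical annotation for one image; actual keys are defined in the
# dataset documentation.
annotation = {
    "image": "layer_0412.png",
    "process_type": "laser_powder_bed_fusion",
    "defects": [
        {
            "type": "porosity",
            "severity": "moderate",
            "probable_causes": ["insufficient_energy_density", "gas_entrapment"],
            "bbox_xywh": [134, 56, 40, 38],  # assumed bounding-box convention
        }
    ],
}
```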
Datasets for Agents
These specialized datasets are designed for developing and evaluating GenAI agent-based systems in AM environments. They include structured formats for agent interactions, decision-making scenarios, and evaluation frameworks.
AM Agent Interaction Dataset
A collection of agent-human interactions in AM contexts, providing examples of effective communication, problem-solving, and decision-making for training agent-based systems.
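One interaction record might be structured along these lines (roles, turn layout, and annotation fields are hypothetical, not the dataset's actual schema):

```python
# Hypothetical agent-human interaction record for illustration only.
interaction = {
    "scenario": "troubleshooting warping on an extrusion-based print",
    "turns": [
        {"role": "human", "text": "The corners of my ABS part keep lifting off the bed."},
        {"role": "agent", "text": "Warping in ABS is usually thermal. Is the printer enclosed, and what is the bed temperature?"},
        {"role": "human", "text": "Bed at 90 C, no enclosure."},
        {"role": "agent", "text": "An enclosure or draft shield reduces cooling gradients; a brim can also improve bed adhesion."},
    ],
    "outcome": "resolved",
    "skills_demonstrated": ["clarifying_questions", "problem_solving"],
}
```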
Agent Evaluation Frameworks
In addition to the datasets, we provide evaluation frameworks for assessing agent performance in AM contexts:
- Decision Quality Metrics: Evaluating the effectiveness of agent decisions against expert benchmarks (a minimal sketch follows this list)
- Interaction Quality Assessment: Measuring the clarity and usefulness of agent communications
- Process Optimization Evaluation: Assessing improvements in manufacturing outcomes
- Adaptation Capability Testing: Measuring how effectively agents adapt to novel scenarios
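As an illustration of the first framework, the sketch below computes a simple decision-quality score as mean agreement between agent and expert decisions. The function name and exact-match scoring are assumptions; in practice, use the scoring rules defined by the framework itself:

```python
def decision_quality(agent_decisions, expert_decisions, score_fn):
    """Mean score over paired agent/expert decisions.

    score_fn compares one (agent, expert) pair and returns a value
    in [0, 1]; exact match is used as a simple default below.
    """
    pairs = zip(agent_decisions, expert_decisions)
    return sum(score_fn(a, e) for a, e in pairs) / len(agent_decisions)

# Example: exact-match scoring of categorical process decisions.
agent = ["increase_power", "reduce_speed", "no_change"]
expert = ["increase_power", "reduce_speed", "increase_power"]
print(decision_quality(agent, expert, lambda a, e: float(a == e)))  # ~0.67
```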
Prompt Collections
This section provides guidelines and collections for structuring effective prompts when working with GenAI models in AM contexts. Well-designed prompts are crucial for eliciting the desired outputs from GenAI models.
AM Prompt Engineering Collection
Curated collection of effective prompts for various AM tasks, organized by technique (Zero-shot, Few-shot, Chain-of-thought, ReAct, and Directional Stimulus Prompting) with examples and explanations.
Prompt Engineering Techniques
The prompt collections include examples of five popular prompt engineering techniques applied to AM-specific tasks (illustrative prompts follow the list):
- Zero-shot: Prompts that elicit responses without providing examples
- Few-shot: Prompts that include examples of expected outputs
- Chain-of-thought: Prompts that guide the model through a reasoning process
- ReAct: Prompts that combine reasoning and actions in a structured format
- Directional Stimulus: Prompts that guide the model toward specific types of responses
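To make the distinctions concrete, here are illustrative prompts for three of the techniques. These are hypothetical examples, not drawn from the collection itself:

```python
# Zero-shot: no examples, just the task.
zero_shot = (
    "Classify the AM defect described: 'irregular pores clustered "
    "along hatch boundaries.'"
)

# Few-shot: a handful of labeled examples precede the query.
few_shot = """Classify each AM defect description.
Description: spherical voids distributed uniformly -> gas porosity
Description: unmelted powder between layers -> lack of fusion
Description: irregular pores clustered along hatch boundaries ->"""

# Chain-of-thought: ask the model to reason before answering.
chain_of_thought = (
    "A Ti-6Al-4V part printed by laser powder bed fusion cracks near "
    "sharp corners. Reason step by step about likely thermal causes, "
    "then recommend one parameter change."
)
```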
For more information on effective prompt engineering for AM, see our Prompt Engineering Tutorial.
Contributing Data
The AM dataset repository is a community resource that welcomes contributions from researchers and practitioners. By sharing your datasets, you can help advance the field of GenAI in additive manufacturing.
Contribution Guidelines
To contribute datasets to the repository, please follow these guidelines (an example dataset card follows the list):
- Provide clear documentation of dataset structure and content
- Include information on data collection methodology
- Specify any licensing or usage restrictions
- Remove any sensitive or proprietary information
- Include suggested applications and example usage
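A lightweight dataset card covering these five points might look like the hypothetical example below; the field names are suggestions, not a required submission format:

```python
# Hypothetical dataset card; adapt field names to your dataset.
dataset_card = {
    "name": "example-am-defect-images",
    "structure": "images/ (PNG layer images) + labels.csv (one row per image)",
    "collection_methodology": "in-situ camera on an LPBF machine; labels "
                              "assigned manually by two annotators",
    "license": "CC-BY-4.0",
    "sensitive_data_removed": True,  # proprietary parameters stripped
    "suggested_applications": ["defect classification", "benchmarking"],
    "example_usage": "see the loading script shipped with the dataset",
}
```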
Explore Related Resources
Learn how to use these datasets effectively with our tutorials on benchmarking and evaluation.