Machine Learning · Computer Science
Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification
Yunzhen Feng, Elvis Dohmatob, Pu Yang, Francois Charton +1
2024-10-28
Machine Learning · Computer Science
Bridging the Generalisation Gap: Synthetic Data Generation for Multi-Site Clinical Model Validation
Bradley Segal, Joshua Fieggen, David Clifton, Lei Clifton
2025-04-30
Machine Learning · Computer Science
Testing Framework for Black-box AI Models
Aniya Aggarwal, Samiulla Shaikh, Sandeep Hans, Swastik Haldar +2
2021-02-12
Software Engineering · Computer Science
An empirical study of testing machine learning in the wild
Moses Openja, Foutse Khomh, Armstrong Foundjem, Zhen Ming +3
2024-07-16
Machine Learning · Computer Science
Quality Matters: Evaluating Synthetic Data for Tool-Using LLMs
Shadi Iskander, Nachshon Cohen, Zohar Karnin, Ori Shapira +1
2024-09-27
Machine Learning · Computer Science
Using Synthetic Data to estimate the True Error is theoretically and practically doable
Hai Hoang Thanh, Duy-Tung Nguyen, Hung The Tran, Khoat Than
2025-11-04
Software Engineering · Computer Science
Large Language Models Synergize with Automated Machine Learning
Jinglue Xu, Jialong Li, Zhen Liu, Nagar Anthel Venkatesh Suryanarayanan +4
2024-09-10
Information Retrieval · Computer Science
Towards Understanding Bias in Synthetic Data for Evaluation
Hossein A. Rahmani, Varsha Ramineni, Emine Yilmaz, Nick Craswell +1
2025-10-07
Computer Vision and Pattern Recognition · Computer Science
Generating Synthetic Satellite Imagery With Deep-Learning Text-to-Image Models -- Technical Challenges and Implications for Monitoring and Verification
Tuong Vy Nguyen, Alexander Glaser, Felix Biessmann
2024-04-12
Formal Languages and Automata Theory · Computer Science
Model Learning: A Survey on Foundation, Tools and Applications
Shahbaz Ali, Hailong Sun, Yongwang Zhao
2019-01-08
Computers and Society · Computer Science
Simulation as Reality? The Effectiveness of LLM-Generated Data in Open-ended Question Assessment
Long Zhang, Meng Zhang, Wei Lin Wang, Yu Luo
2025-02-11
Computation and Language · Computer Science
AutoGeTS: Knowledge-based Automated Generation of Text Synthetics for Improving Text Classification
Chenhao Xue, Yuanzhe Jin, Adrian Carrasco-Revilla, Joyraj Chakraborty +1
2025-08-15
Computer Vision and Pattern Recognition · Computer Science
Accelerating Domain-Aware Electron Microscopy Analysis Using Deep Learning Models with Synthetic Data and Image-Wide Confidence Scoring
Matthew J. Lynch, Ryan Jacobs, Gabriella Bruno, Priyam Patki +2
2025-09-04
Machine Learning · Computer Science
An Inductive Synthesis Framework for Verifiable Reinforcement Learning
He Zhu, Zikang Xiong, Stephen Magill, Suresh Jagannathan
2019-07-18
Software Engineering · Computer Science
Synthetic Test Data Generation Using Recurrent Neural Networks: A Position Paper
Razieh Behjati, Erik Arisholm, Chao Tan, Margrethe M. Bedregal
2024-07-09
Computation and Language · Computer Science
Synthetic Data Generation with Large Language Models for Text Classification: Potential and Limitations
Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
2023-10-16
Computation and Language · Computer Science
On LLMs-Driven Synthetic Data Generation, Curation, and Evaluation: A Survey
Lin Long, Rui Wang, Ruixuan Xiao, Junbo Zhao +3
2024-06-24
Computer Vision and Pattern Recognition · Computer Science
AutoSimulate: (Quickly) Learning Synthetic Data Generation
Harkirat Singh Behl, Atılım Güneş Baydin, Ran Gal, Philip H. S. Torr +1
2020-08-20
Computation and Language · Computer Science
An Empirical Study of Validating Synthetic Data for Formula Generation
Usneek Singh, José Cambronero, Sumit Gulwani, Aditya Kanade +4
2025-07-14