Related papers: Customs Import Declaration Datasets

200 papers

Real-world data often exhibits bias, imbalance, and privacy risks. Synthetic datasets have emerged to address these issues. This paradigm relies on generative AI models to generate unbiased, privacy-preserving data while maintaining…

Customs officials across the world encounter huge volumes of transactions. With increased connectivity and globalization, customs transactions continue to grow every year. Associated with customs transactions is customs fraud - the…

Machine Learning · Computer Science 2023-08-22 Karandeep Singh, Yu-Che Tsai, Cheng-Te Li, Meeyoung Cha, Shou-De Lin

This article describes techniques employed in the production of a synthetic dataset of driver telematics emulated from a similar real insurance dataset. The synthetic dataset generated has 100,000 policies that included observations about…

Machine Learning · Statistics 2021-02-02 Banghee So, Jean-Philippe Boucher, Emiliano A. Valdez

Two elements have been essential to AI's recent boom: (1) deep neural nets and the theory and practice behind them; and (2) cloud computing with its abundant labeled data and large computing resources. Abundant labeled data is available for…

Databases · Computer Science 2019-10-09 Erik R. Altman

Using machine learning models to generate synthetic data has become common in many fields. Technology to generate synthetic transactions that can be used to detect fraud is also growing fast. Generally, this synthetic data contains only…

Machine Learning · Computer Science 2023-06-30 Shuo Wang, Terrence Tricco, Xianta Jiang, Charles Robertson, John Hawkin

The switch from a Model-Centric to a Data-Centric mindset is putting emphasis on data and its quality rather than algorithms, bringing forward new challenges. In particular, the sensitive nature of the information in highly regulated…

Machine Learning · Computer Science 2022-04-14 Giorgio Visani, Giacomo Graffi, Mattia Alfero, Enrico Bagli, Davide Capuzzo, Federico Chesani

Machine learning systems require representations of the real world for training and testing - they require data, and lots of it. Collecting data at scale has logistical and ethical challenges, and synthetic data promises a solution to these…

Computers and Society · Computer Science 2024-05-06 Cedric Deslandes Whitney, Justin Norman

Synthetic data serves as an alternative in training machine learning models, particularly when real-world data is limited or inaccessible. However, ensuring that synthetic data mirrors the complex nuances of real-world data is a challenging…

Machine Learning · Computer Science 2023-10-27 Lasse Hansen, Nabeel Seedat, Mihaela van der Schaar, Andrija Petrovic

Synthetic data generation, a cornerstone of Generative Artificial Intelligence, promotes a paradigm shift in data science by addressing data scarcity and privacy while enabling unprecedented performance. As synthetic data becomes more…

Machine Learning · Statistics 2024-03-12 Xiaotong Shen, Yifei Liu, Rex Shen

Synthetic data is often positioned as a solution to replace sensitive fixed-size datasets with a source of unlimited matching data, freed from privacy concerns. There has been much progress in synthetic data generation over the last decade,…

Machine Learning · Computer Science 2025-06-09 Graham Cormode, Samuel Maddock, Enayat Ullah, Shripad Gade

Synthetic data, or data generated by machine learning models, is increasingly emerging as a solution to the data access problem. However, its use introduces significant governance and accountability challenges, and potentially debases…

Computers and Society · Computer Science 2025-06-03 Madhavendra Thakur, Jason Hausenloy

Knowledge of the changing traffic is critical in risk management. Customs offices worldwide have traditionally relied on local resources to accumulate knowledge and detect tax fraud. This naturally poses countries with weak infrastructure…

Artificial Intelligence · Computer Science 2022-01-19 Sungwon Park, Sundong Kim, Meeyoung Cha

This research delves into the construction and utilization of synthetic datasets, specifically within the telematics sphere, leveraging OpenAI's powerful language model, ChatGPT. Synthetic datasets present an effective solution to…

Computers and Society · Computer Science 2023-06-27 Ryan Lingo

Autonomous driving techniques have been flourishing in recent years while thirsting for huge amounts of high-quality data. However, it is difficult for real-world datasets to keep up with the pace of changing requirements due to their…

Image and Video Processing · Electrical Eng. & Systems 2024-02-29 Zhihang Song, Zimin He, Xingyu Li, Qiming Ma, Ruibo Ming, Zhiqi Mao, Huaxin Pei, Lihui Peng, Jianming Hu, Danya Yao, Yi Zhang

Synthetic data has gained significant momentum thanks to sophisticated machine learning tools that enable the synthesis of high-dimensional datasets. However, many generation techniques do not give the data controller control over what…

Cryptography and Security · Computer Science 2022-11-22 Florimond Houssiau, Samuel N. Cohen, Lukasz Szpruch, Owen Daniel, Michaela G. Lawrence, Robin Mitra, Henry Wilde, Callum Mole

We study the human-in-the-loop customs inspection scenario, where an AI-assisted algorithm supports customs officers by recommending a set of imported goods to be inspected. If the inspected items are fraudulent, the officers can levy extra…

Machine Learning · Computer Science 2022-02-24 Sundong Kim, Tung-Duong Mai, Sungwon Han, Sungwon Park, Thi Nguyen Duc Khanh, Jaechan So, Karandeep Singh, Meeyoung Cha

With the rising adoption of Machine Learning across domains like banking, pharmaceuticals, ed-tech, etc., it has become of utmost importance to adopt responsible AI methods to ensure models are not unfairly discriminating against any group.…

Machine Learning · Computer Science 2022-12-02 Bhushan Chaudhari, Himanshu Chaudhary, Aakash Agarwal, Kamna Meena, Tanmoy Bhowmik

The widespread use of big data across sectors has raised major privacy concerns, especially when sensitive information is shared or analyzed. Regulations such as GDPR and HIPAA impose strict controls on data handling, making it difficult to…

Machine Learning · Computer Science 2025-12-10 Anantaa Kotal, Anupam Joshi

In the current data driven era, synthetic data, artificially generated data that resembles the characteristics of real world data without containing actual personal information, is gaining prominence. This is due to its potential to…

Machine Learning · Computer Science 2023-09-06 Tshilidzi Marwala, Eleonore Fournier-Tombs, Serge Stinckwich

While data sharing is crucial for knowledge development, privacy concerns and strict regulation (e.g., European General Data Protection Regulation (GDPR)) unfortunately limit its full effectiveness. Synthetic tabular data emerges as an…

Machine Learning · Computer Science 2021-08-24 Aditya Kunar