Related papers: Robust PDF Files Forensics Using Coding Style
Organizations publish and share more and more electronic documents like PDF files. Unfortunately, most organizations are unaware that these documents can compromise sensitive information like authors' names, details on the information system…
In recent years, Portable Document Format, commonly known as PDF, has become a democratized standard for document exchange and dissemination. This trend has been due to its characteristics such as its flexibility and portability across…
Malware scanners try to protect users from opening malicious documents by statically or dynamically analyzing documents. However, malware developers may apply evasions that conceal the maliciousness of a document. Given the variety of…
All methodologies for detecting plagiarism to date have focused on the final digital "outcome", such as a document or source code. Our novel approach takes the creation process into account using logged events collected by special software…
The popularity of the PDF format and the rich JavaScript environment that PDF viewers offer make PDF documents an attractive attack vector for malware developers. PDF documents present a serious threat to the security of organizations…
Malware still constitutes a major threat in the cybersecurity landscape, also due to the widespread use of infection vectors such as documents. These infection vectors hide embedded malicious code from the victim users, facilitating the use…
The increasing prevalence of malicious Portable Document Format (PDF) files necessitates robust and comprehensive feature extraction techniques for effective detection and analysis. This work presents a unified framework that integrates…
Over the last decade, malicious software (or malware, for short) has shown an increasing sophistication and proliferation, fueled by a flourishing underground economy, in response to the increasing complexity of modern defense mechanisms.…
Due to the popularity of the Portable Document Format (PDF) and the increasing number of vulnerabilities in major PDF viewer applications, malware writers continue to use it to deliver malware via web downloads, email attachments and other methods…
Malicious PDF files represent one of the biggest threats to computer security. To detect them, significant research has been done using handwritten signatures or machine learning based on manual feature extraction. Those approaches are both…
Tampering or forgery of digital documents has become widespread, most commonly through altering images without any malicious intent such as enhancing the overall appearance of the image. However, there are occasions when tampering of…
There is a general belief that software must be able to easily do things that humans find difficult. Since finding sources for plagiarism in a text is not an easy task, there is a widespread expectation that it must be simple for software…
In recent years, as electronic files include personal records and business activities, these files can be used as important evidence in a digital forensic investigation process. In general, the data that can be verified using its own…
Malicious PDF documents present a serious threat to various security organizations that require modern threat intelligence platforms to effectively analyze and characterize the identity and behavior of PDF malware. State-of-the-art…
Whether a file is accepted by a single parser is not a reliable indication of whether a file complies with its stated format. Bugs within both the parser and the format specification mean that a compliant file may fail to parse, or that a…
Machine learning (ML)-based malware detection systems are becoming increasingly important as malware threats increase and get more sophisticated. PDF files are often used as vectors for phishing attacks because they are widely regarded as…
The Department of Homeland Security in the United States estimates that 90% of software vulnerabilities can be traced back to defects in design and software coding. The financial impact of these vulnerabilities has been shown to exceed 380…
In recent years, defect prediction has received a great deal of attention in the empirical software engineering world. Predicting software defects before the maintenance phase is very important not only to decrease the maintenance costs but…
Programming language detection is a common need in the analysis of large source code bases. It is supported by a number of existing tools that rely on several features, and most notably file extensions, to determine file types. We consider…
Source code plagiarism detection is a problem that has been addressed several times before, and several tools have been developed for that purpose. In this research project we investigated a set of possible disguises that can be…