Related papers: Improved spread-location visualization
Regression experts consistently recommend plotting residuals for model diagnosis, despite the availability of many numerical hypothesis test procedures designed to use residuals to assess problems with a model fit. Here we provide evidence…
Plotting the residuals is a recommended procedure to diagnose deviations from linear model assumptions, such as non-linearity, heteroscedasticity, and non-normality. The presence of structure in residual plots can be tested using the lineup…
Rapid development of evolutionary algorithms in handling many-objective optimization problems requires viable methods of visualizing a high-dimensional solution set. Parallel coordinates which scale well to high-dimensional data are such a…
Scatterplots are a common tool for exploring multidimensional datasets, especially in the form of scatterplot matrices (SPLOMs). However, scatterplots suffer from overplotting when categorical variables are mapped to one or two axes, or the…
We propose a fold change visualization that demonstrates a combination of properties from log and linear plots of fold change. A useful fold change visualization can exhibit: (1) readability, where fold change values are recoverable from…
Scatter plots are popular for displaying 2D data, but in practice, many data sets have more than two dimensions. For the analysis of such multivariate data, it is often necessary to switch between scatter plots of different dimension pairs,…
Fully packed loop models describe the statistics of closely packed nested polygons on the square lattice. Many exact results can be obtained for these models, even for finite geometries, using their close relationship to alternating-sign…
3D scatterplots are a well-established plotting technique that can be used to represent data with three or more dimensions. On paper and computer monitors they are essentially two-dimensional projections of the three-dimensional Cartesian…
Scatterplots provide a visual representation of bivariate data (or 2D embeddings of multivariate data) that allows for effective analyses of data dependencies, clusters, trends, and outliers. Unfortunately, classical scatterplots suffer…
The overdraw problem of scatterplots seriously interferes with the visual tasks. Existing methods, such as data sampling, node dispersion, subspace mapping, and visual abstraction, cannot guarantee the correspondence and consistency between…
The spatial scan statistic is widely used in epidemiology and medical studies as a tool to identify hotspots of diseases. The classical spatial scan statistic assumes the number of disease cases in different locations have independent…
The paper advocates the use of a statistical tool dedicated to the exploration of data samples populated by several sources of events. This new technique, called sPlot, is able to unfold the contributions of the different sources to the…
This article inspects whether a multivariate distribution is different from a specified distribution or not, and it also tests the equality of two multivariate distributions. In the course of this study, a graphical tool-kit using…
The potential of location-shift models to find adequate models between the proportional odds model and the non-proportional odds model is investigated. It is demonstrated that these models are very useful in ordinal modeling. While…
A central concept in information visualization research and practice is the notion of visual variable effectiveness, or the perceptual precision at which values are decoded given visual channels of encoding. Formative work from Cleveland &…
We introduce an adaptive scattered data fitting scheme as extension of local least squares approximations to hierarchical spline spaces. To efficiently deal with non-trivial data configurations, the local solutions are described in terms of…
Multivariate location and scatter matrix estimation is a cornerstone in multivariate data analysis. We consider this problem when the data may contain independent cellwise and casewise outliers. Flat data sets with a large number of…
Overplotting of data points is a common problem when visualizing large datasets in a scatterplot, particularly when mapping nominal dimensions to one of the scatterplot axes. Transparency, aggregation, and jittering have previously been…
Residuals are a key component of diagnosing model fit. The usual practice is to compute standardized residuals using expected values and standard deviations of the observed data, then use these values to detect outliers and assess model…
We propose a new method for the construction and visualization of boxplot-type displays for functional data. We use a recent functional data analysis framework, based on a representation of functions called square-root slope functions, to…