Art Critic: An automatic painting classifier
Art Critic is a program that can be trained to identify a color
image of a painting as being an example of a given style of art.
We trained the system to distinguish Impressionist and post-Impressionist
art from other periods, including Cubist, Surrealist, and Baroque.
The system was only moderately successful, and suffered from several
limitations in its design.
Algorithm:
Classifying artwork requires relatively high-resolution renderings, resulting in raw data of very high dimensionality. In addition, much of the relevant information is local detail and texture.
We used a three-stage classification system, designed to reduce the volume
of the data without losing useful features and the local detail.
Data and Training:
Our data set consisted of 212 JPEG images of paintings by ten well-known artists, representing the Renaissance (Michelangelo), Baroque (Rembrandt, Rubens), Impressionism and post-Impressionism (Monet, Renoir, Cezanne, Van Gogh), Cubism (Braque, Picasso), and Surrealism (Dali). About half of the images were from Impressionists and post-Impressionists. Most images were between 500x500 and 1000x1000 pixels.
The neural network classifier was trained with 16800 labeled segments (the 100 largest segments from each of 168 images). 90% of the segments were used for training while the other 10% were held out for validation against overfitting. The remaining 44 images, never seen while training, were used to evaluate the full classification system.
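As a rough illustration of the arithmetic above, the following sketch builds 168 x 100 = 16800 segment identifiers and holds out 10% of them. The function name, the random shuffle, and the choice to split at the segment level (rather than by image) are assumptions made for the example, not a description of our actual code.

```python
import random

def split_segments(n_images=168, per_image=100, train_frac=0.9, seed=0):
    """Return (train, validation) lists of (image_index, segment_index) pairs.

    Assumption: segments are shuffled and split individually, so segments
    from one image may land on both sides of the split.
    """
    segments = [(i, s) for i in range(n_images) for s in range(per_image)]
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    rng.shuffle(segments)
    n_train = round(len(segments) * train_frac)
    return segments[:n_train], segments[n_train:]

train, validation = split_segments()
print(len(train), len(validation))  # 15120 1680
```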
Results:
Segmentation: Image segmentation results were frequently ugly, especially on
paintings with highly varying colors. Using segmentation instead of a simple
color histogram was probably not helpful in these cases. On other images,
the segmentation captured many human-recognizable features and was likely
a constructive processing step.
Above: Van Gogh self-portrait with noisy segmentation
Classification: The neural network design went through three evolutionary stages as we tried to improve its performance:
Analysis:
There are several basic problems involved in classifying impressionist
works:
1) Most impressionist works use a great deal of color. Because they are
frequently of natural scenes, the color is often predominantly green
or blue. Because of this, the classifier tends to mark any blue or green
region as being impressionist, and any dark region as being
non-impressionist. This is fostered by the fact that a region classifier
has little to go on - the six color features are vital for
classification. It tends to result in a binary red/green classifier,
however. This would be alleviated by considering multiple regions.
Even taking multiple regions into consideration, color is not sufficient
to disambiguate a painting's stylistic category. Many painters used
palettes that could belong to either category. Braque is a good example of
this:
Monet: Impressionist
Braque: Cubist
Rembrandt: Baroque
Braque: Cubist
2) Most impressionist works have soft lines, making the breaking of the
image into regions difficult. Because the segmentation algorithm is
limited to a certain sensitivity, the tendency in such a case is to end up
with either an image with many small, uninformative regions (high merging
threshold), or one large, uninformative region (low merging
threshold). The blurrier the lines, the narrower the good cross-over
area.
This could potentially be a useful feature, but quantifying 'line softness'
is an iffy matter at best.
3) Most impressionist works are heavily textured. This has a similar effect on the segmentation as (2), but with the additional difficulty that many other periods also use heavy texture. Cubism, in particular, uses a similar style.
4) Impressionism is basically a context-driven distinction - each "distinguishing" feature could also be used to distinguish some other group. Indeed, it is not the features themselves that make a painting "impressionist" - whether it is impressionist is defined by the difference between the thing represented and the representation. Such a distinction cannot be practically captured. Consequently, any categorizing algorithm will be more a heuristic than anything else.
The third problem is actually one of the major reasons that it is advantageous to both initially segment the image and then merge compatible regions: without such merging, the regions resulting from a highly-textured impressionist work and a smooth traditional painting are comparable. It is only by combining regions that you get large tell-tale regions in the non-impressionist paintings while leaving a number of smaller regions in the impressionist paintings. It also prevents anomalous areas from gaining too much weight.
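The effect of merging can be sketched on a toy example. The code below is not our segmentation algorithm: it works on a one-dimensional strip of grayscale regions, and the `max_diff` parameter is a hypothetical stand-in (neighbours merge when their mean intensities differ by at most `max_diff`) that is not necessarily defined the same way as the merging threshold discussed above.

```python
def merge_adjacent(regions, max_diff):
    """Greedily merge neighbouring regions of a 1-D strip.

    regions: list of (mean_intensity, pixel_count) pairs, left to right.
    Two neighbours merge when their means differ by at most max_diff;
    the merged region takes the pixel-weighted mean of the two.
    """
    if not regions:
        return []
    merged = [regions[0]]
    for mean, size in regions[1:]:
        prev_mean, prev_size = merged[-1]
        if abs(mean - prev_mean) <= max_diff:
            total = prev_size + size
            merged[-1] = ((prev_mean * prev_size + mean * size) / total, total)
        else:
            merged.append((mean, size))
    return merged

# Made-up data: a smooth strip and a heavily textured strip.
smooth = [(100, 10), (102, 10), (101, 10), (103, 10)]
textured = [(100, 10), (140, 10), (95, 10), (150, 10)]

print(len(merge_adjacent(smooth, 10)))    # 1 large region
print(len(merge_adjacent(textured, 10)))  # 4 small regions
```

Before merging, both strips have four equal-sized regions; after merging, only the smooth strip collapses into one large region.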
These problems make a good balance of features particularly important. There should be enough line information to help offset the overweighted color information, and enough regions considered to prevent a painting being classified by an anomalous region.
The only line type features our algorithm considered were boundary length
(length around a region), and average curvature (higher curvature being
more extreme).
The first feature did little good because as well as considering the major
regions, we also considered smaller regions to allow more data. For major
regions, boundary length may give a good indication of painting type,
because impressionist works tend to end up with more, smaller regions, but
the majority of regions analyzed would fall in the medium or small range -
more or less the same regardless of type.
The average curvature was a potentially useful feature; the main problem
with it was that it did not consider a wide enough window in making its
curvature judgements. To properly recognize the difference between these two line segments
it is necessary to consider a window of at least two pixels in every direction. However, this becomes intractable very quickly.
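The two boundary features discussed above can be sketched as follows. This is an illustrative reconstruction, not our original feature code: it assumes the boundary is available as a closed list of (x, y) points, and it defines curvature as the mean absolute turning angle, which is only one of several reasonable definitions. The `window` parameter is a hypothetical knob for how far along the boundary the neighbouring points are taken.

```python
import math

def boundary_length(points):
    """Perimeter of a closed boundary given as a list of (x, y) points."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def average_curvature(points, window=1):
    """Mean absolute turning angle (radians) along a closed boundary.

    At each point, the incoming and outgoing directions are measured to
    the points `window` steps behind and ahead along the boundary.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[(i - window) % n]
        x1, y1 = points[i]
        x2, y2 = points[(i + window) % n]
        a_in = math.atan2(y1 - y0, x1 - x0)
        a_out = math.atan2(y2 - y1, x2 - x1)
        # Wrap the angle difference into [-pi, pi) before taking its size.
        turn = (a_out - a_in + math.pi) % (2 * math.pi) - math.pi
        total += abs(turn)
    return total / n

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(boundary_length(square))    # 4.0
print(average_curvature(square))  # pi / 2 per corner
```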
Combinations of regions:
For these reasons, to get decent results for an overall picture, it would
be necessary to consider a number of regions at the same time. The
basic difficulty with this is that the number of regions varies from image to image, so
there has to be at least some degree of separation between the image
analysis and individual region analysis.
We tried two voting-type techniques for taking multiple (5-10) regions
into account at the same time, with limited success.
The first, a simple voting scheme, worked moderately well, boosting the
accuracy by an average of 5 percent. The obvious problem with this is that
it gives all results equal weight, when in fact some might be far more
important than others.
In an attempt to solve this problem, we linked several evaluated regions
together using another neural network. This actually ended up giving lower
accuracy than the regions taken individually. Here the problem was
probably lack of data - because analyzing several regions together reduced
our already limited training data even further, there were too few
examples from which to extrapolate a reasonable rule.
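For reference, the first, simple voting scheme can be sketched in a few lines. The function name, the 0.5 cut-off, and the assumption that each region's classifier output is a score in [0, 1] are illustrative choices, not details of our implementation.

```python
def vote(region_scores, threshold=0.5):
    """Classify a painting by majority vote over per-region scores.

    region_scores: classifier outputs in [0, 1], one per region, where
    higher means "more likely impressionist". Every region gets one
    equally weighted vote; ties count as non-impressionist.
    """
    votes = sum(1 for s in region_scores if s >= threshold)
    return votes * 2 > len(region_scores)

# Made-up scores for five regions of one painting: three of five vote yes.
print(vote([0.9, 0.8, 0.3, 0.6, 0.2]))  # True
```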
Although the end algorithm was only moderately successful, given more data
it should not be difficult to get a good classifier. The most notable
limitation of our implementation lay in the fact that we mostly considered
regions individually. On the regional level, it is very difficult to tell
the difference between a painting in one style and a painting in another -
many of the components are very much the same. This painting may have this
nice little green region - it doesn't mean that the painting is
impressionist. To effectively solve this problem, a network combining most
of the region results would be necessary. A good classifier would
probably require combining 50-100 regions.
Although this would require a huge amount of data to train properly, the
initial findings we have shown here suggest that it could be quite
successful. Such an extension could also benefit from further features,
but this might not be feasible, since the more complicated features often
take an impractical amount of time to derive.