Art Critic
CS229 final project
Annaka Kalton and Greg Parker


Art Critic: An automatic painting classifier
Art Critic is a program that can be trained to identify a color image of a painting as being an example of a given style of art. We trained the system to distinguish Impressionist and post-Impressionist art from other periods, including Cubist, Surrealist, and Baroque. The system was only moderately successful, and suffered from several limitations in its design.

Classifying artwork requires relatively high-resolution renderings, resulting in raw data of very high dimensionality. In addition, much of the relevant information is local detail and texture.
We used a three-stage classification system, designed to reduce the volume of the data without losing useful features and the local detail.

  1. Segmentation
    The image is divided into many small pieces. Each piece is analyzed separately in the later stages.
    We used two methods for image segmentation: a fast graph partitioning algorithm for the primary segmentation, and a slower region competition algorithm to fine-tune the segmentation and eliminate small segments.

  2. Feature extraction
    Color and shape properties are extracted from each segment separately. We extracted ten features: average color and color variance for red, green, and blue; border length; enclosed area; average and absolute curvature.
    By examining features of a single segment rather than the entire image, we preserve local details that otherwise would have been lost. For example, an image constructed from a solid red region and a solid green region would have high overall variance, but a segmentation that separated the two colors would instead see low variance at the local level.

  3. Classification
    The feature values from one segment are fed into a neural network trained to recognize segments from impressionist images. Results from the 100 largest segments of an image are combined to get the final classification of the image.
    We used single-hidden-layer networks with between 50 and 100 hidden nodes for segment classification, and a simple vote to decide the final classification.
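The red-and-green example from the feature extraction step can be checked with a minimal sketch. It assumes a segment is simply a list of (r, g, b) pixel tuples; the function name `color_features` and the toy pixel data are hypothetical, and only the six color features are computed here.

```python
from statistics import mean, pvariance

def color_features(pixels):
    """Return [mean R, mean G, mean B, var R, var G, var B] for (r, g, b) pixels."""
    channels = list(zip(*pixels))  # three tuples: all reds, all greens, all blues
    means = [mean(c) for c in channels]
    variances = [pvariance(c) for c in channels]
    return means + variances

# Hypothetical two-region image: 50 solid red pixels and 50 solid green pixels.
red_segment = [(255, 0, 0)] * 50
green_segment = [(0, 255, 0)] * 50
whole_image = red_segment + green_segment

global_features = color_features(whole_image)   # high red and green variance
red_features = color_features(red_segment)      # zero variance in every channel
green_features = color_features(green_segment)  # zero variance in every channel
```

Computed over the whole image, the red and green channels each have a large variance, while each segment taken alone has zero variance in every channel.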

Above: Segmentation demonstration. Right-hand side is result after graph partitioning. Left side is result after further processing by region competition.

Data and Training:
Our data set consisted of 212 jpeg images of paintings by nine well-known artists, representing the Renaissance (Michelangelo), Baroque (Rembrandt, Rubens), Impressionism and post-Impressionism (Monet, Renoir, Cezanne, Van Gogh), Cubism (Braque, Picasso), and Surrealism (Dali). About half of the images were from Impressionists and post-Impressionists. Most images were between 500x500 and 1000x1000 pixels.
The neural network classifier was trained with 16800 labeled segments (the 100 largest segments from each of 168 images). 90% of the segments were used for training while the other 10% were held out for validation against overfitting. The remaining 44 images, never seen while training, were used to evaluate the full classification system.
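The counts above can be reproduced with a short sketch. It assumes the 90/10 split was made by shuffling individual segments, which the description does not confirm; `split_segments`, its parameters, and the fixed seed are hypothetical.

```python
import random

def split_segments(segments, holdout_fraction=0.1, seed=0):
    """Shuffle segments and return (training, validation) lists."""
    shuffled = segments[:]
    random.Random(seed).shuffle(shuffled)
    n_holdout = int(round(len(shuffled) * holdout_fraction))
    return shuffled[n_holdout:], shuffled[:n_holdout]

TOTAL_IMAGES = 212
TRAIN_IMAGES = 168
SEGMENTS_PER_IMAGE = 100

# One (image index, segment index) pair per labeled segment.
segment_ids = [(img, seg)
               for img in range(TRAIN_IMAGES)
               for seg in range(SEGMENTS_PER_IMAGE)]

train, validation = split_segments(segment_ids)
test_images = TOTAL_IMAGES - TRAIN_IMAGES  # images never seen during training
```

Under these assumptions the network sees 15120 training segments and 1680 validation segments, and 44 images remain for the end-to-end evaluation.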

Segmentation: Image segmentation results were frequently ugly, especially on paintings with highly varying colors. Using segmentation instead of a simple color histogram was probably not helpful in these cases. On other images, the segmentation captured many human-recognizable features and was likely a constructive processing step.
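The sensitivity of the primary segmentation to its merging parameter can be illustrated with a small sketch. The exact graph partitioning algorithm is not specified here, so the code below is a stand-in: a simplified greedy merging criterion in the style of Felzenszwalb and Huttenlocher (listed under related work), applied to a grayscale image given as a list of rows. The function names and the scale parameter `k` are assumptions.

```python
def segment_graph(image, k=50.0):
    """Label pixels of a 2-D grayscale image by greedy merging of a grid graph."""
    height, width = len(image), len(image[0])
    parent = list(range(height * width))
    size = [1] * (height * width)
    internal = [0.0] * (height * width)  # largest edge weight inside each component

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Edges join 4-connected neighbors, weighted by intensity difference.
    edges = []
    for y in range(height):
        for x in range(width):
            here = y * width + x
            if x + 1 < width:
                edges.append((abs(image[y][x] - image[y][x + 1]), here, here + 1))
            if y + 1 < height:
                edges.append((abs(image[y][x] - image[y + 1][x]), here, here + width))

    # Visit edges from weakest to strongest; merge when the edge is no heavier
    # than either component's internal variation plus a size-dependent slack.
    for weight, a, b in sorted(edges):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        if weight <= min(internal[ra] + k / size[ra], internal[rb] + k / size[rb]):
            parent[rb] = ra
            size[ra] += size[rb]
            internal[ra] = max(internal[ra], internal[rb], weight)

    return [[find(y * width + x) for x in range(width)] for y in range(height)]

def count_segments(labels):
    """Number of distinct labels in a label image."""
    return len({label for row in labels for label in row})

# Hypothetical 4x4 image: dark left half, bright right half.
image = [[10, 10, 200, 200] for _ in range(4)]
labels_fine = segment_graph(image, k=50.0)      # keeps the two halves apart
labels_coarse = segment_graph(image, k=2000.0)  # merges everything into one region
```

In this sketch a larger `k` tolerates larger intensity differences and so produces fewer, bigger regions. On a painting with soft edges the useful range of `k` would be correspondingly narrow.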

Above: Van Gogh self-portrait with noisy segmentation

Classification: The neural network design went through three evolutionary stages as we tried to improve its performance:

Classifier 1: 50% accuracy on segments, no better than random.
This classifier used segments directly from the graph segmentation, before region competition had improved the segmentation. Including region competition in later classifiers helped immensely.

Classifier 2: 70% accuracy on segments
After being trained on post-competition segments, this classifier showed much promise, until we realized that its accuracy was essentially 100% on Impressionists and post-Impressionists, 100% on Rembrandt and Rubens, and close to zero on all other non-Impressionists.
The Baroque art by Rembrandt and Rubens is easily recognizable by its limited color set of red and brown, while the Impressionists generally included lots of green. The classifier was doing little more than equating red with Baroque and green with Impressionism, thus failing on most other images.

Classifier 3: 60% accuracy on segments, 66% accuracy on images.
This final classifier was trained with the same data as Classifier 2 except that all Rembrandt images were thrown out. The resulting system is much better at recognizing most non-Impressionists and is still nearly 100% accurate on Rembrandt.
The final results can be changed by shifting the bias of the vote counter: putting the boundary at 50% recognizes nearly all Impressionist images correctly but produces many false positives, while requiring an Impressionist image to have 80% of its segments classified Impressionist is about 2/3 correct with most artists.
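The effect of shifting the vote boundary can be sketched as follows. The function `classify_image`, the use of `>=` at the boundary, and the vote counts are all hypothetical; the sketch only illustrates how the same segment votes lead to different image labels at the 50% and 80% boundaries.

```python
def classify_image(segment_votes, threshold=0.5):
    """Return True (Impressionist) if enough segments voted Impressionist.

    segment_votes: list of booleans, one per segment, True = Impressionist.
    threshold: fraction of Impressionist votes required (an assumed convention).
    """
    fraction = sum(segment_votes) / len(segment_votes)
    return fraction >= threshold

# Hypothetical image: 65 of its 100 largest segments were labeled Impressionist.
votes = [True] * 65 + [False] * 35

lenient = classify_image(votes, threshold=0.5)  # accepted at the 50% boundary
strict = classify_image(votes, threshold=0.8)   # rejected at the 80% boundary
```

Raising the threshold trades missed Impressionist images for fewer false positives.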

Above: a typically earth-toned Rembrandt painting

There are several basic problems involved in classifying impressionist works:
1) Most impressionist works use a great deal of color. Because they are frequently of natural scenes, the color is often predominantly green or blue. Because of this, the classifier tends to mark any blue or green region as being impressionist, and any dark region as being non-impressionist. This is fostered by the fact that a region classifier has little to go on - the six color features are vital for classification. It tends to result in a binary red/green classifier, however. This would be alleviated by considering multiple regions.
Even taking multiple regions into consideration, color is not sufficient to disambiguate a painting's stylistic category. Many painters used palettes that could belong to either category. Braque is a good example of this:

Monet: Impressionist

Braque: Cubist

Rembrandt: Baroque

Braque: Cubist

2) Most impressionist works have soft lines, making the breaking of the image into regions difficult. Because the segmentation algorithm is limited to a certain sensitivity, the tendency in such a case is to end up with either an image with many small, uninformative regions (high merging threshold), or one large, uninformative region (low merging threshold). The blurrier the lines, the narrower the good cross-over area.
This could potentially be a useful feature, but quantifying 'line softness' is an iffy matter at best.

3) Most impressionist works are heavily textured. This has a similar effect on the segmentation as (2), but with the additional difficulty that many other periods also use heavy texture. Cubism, in particular, uses a similar style.

4) Impressionism is basically a context-driven distinction - each "distinguishing" feature could also be used to distinguish some other group. Indeed, it is not the features themselves that make a painting "impressionist" - whether it is impressionist is defined by the difference between the thing represented and the representation. Such a distinction cannot be practically captured. Consequently, any categorizing algorithm will be more a heuristic than anything else.

The third problem is actually one of the major reasons that it is advantageous to both initially segment the image and then merge compatible regions: without such merging, the regions resulting from a highly-textured impressionist work and a smooth traditional painting are comparable. It is only by combining regions that you get large tell-tale regions in the non-impressionist paintings while leaving a number of smaller regions in the impressionist paintings. It also prevents anomalous areas from gaining too much weight.

These problems make a good balance of features particularly important. There should be enough line information to help offset the overweighted color information, and enough regions considered to prevent a painting being classified by an anomalous region.

The only line type features our algorithm considered were boundary length (length around a region), and average curvature (higher curvature being more extreme).
The first feature did little good because as well as considering the major regions, we also considered smaller regions to allow more data. For major regions, boundary length may give a good indication of painting type, because impressionist works tend to end up with more, smaller regions, but the majority of regions analyzed would fall in the medium or small range - more or less the same regardless of type.
The average curvature was a potentially useful feature; the main problem with it was that it did not consider wide enough a window in making its curvature judgements. To properly recognize the difference between these two line segments

it is necessary to consider a window of at least two pixels in every direction. However, this becomes intractable very quickly.
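A minimal sketch of boundary-based features with an adjustable window is given below. The curvature formula used in the project is not stated, so this sketch assumes curvature is measured as the turning angle between the incoming and outgoing directions at each boundary point, taken `window` points apart. The function name `boundary_features` and the `window` parameter are hypothetical.

```python
import math

def boundary_features(points, window=1):
    """Perimeter, mean signed turn, and mean absolute turn of a closed boundary.

    points: list of (x, y) boundary points in order around the region.
    window: how many points away the neighbors used for each heading are.
    """
    n = len(points)
    perimeter = sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

    turns = []
    for i in range(n):
        prev_pt = points[(i - window) % n]
        here = points[i]
        next_pt = points[(i + window) % n]
        heading_in = math.atan2(here[1] - prev_pt[1], here[0] - prev_pt[0])
        heading_out = math.atan2(next_pt[1] - here[1], next_pt[0] - here[0])
        # Wrap the heading change into [-pi, pi).
        turn = (heading_out - heading_in + math.pi) % (2 * math.pi) - math.pi
        turns.append(turn)

    mean_signed = sum(turns) / n
    mean_absolute = sum(abs(t) for t in turns) / n
    return perimeter, mean_signed, mean_absolute

# Hypothetical boundary: the four corners of a unit square, counterclockwise.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
perimeter, mean_signed, mean_absolute = boundary_features(square)
```

A larger `window` smooths over pixel-level jaggedness, at the cost of examining more distant neighbors for every boundary point.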

Combinations of regions: For these reasons, to get decent results for an overall picture, it would be necessary to consider a number of regions at the same time. The basic difficulty with this is that the number of regions varies, so there has to be at least some degree of separation between the image analysis and individual region analysis.
We tried two voting-type techniques for taking multiple (5-10) regions into account at the same time, with limited success.
The first, a simple voting scheme, worked moderately well, boosting the accuracy by an average of 5 percent. The obvious problem with this is that it gives all results equal weight, when in fact some might be far more important than others.
In an attempt to solve this problem, we linked several evaluated regions together using another neural network. This actually ended up giving worse percentages than the regions taken individually. Here the problem was probably lack of data: because our limited training data was limited even more by analyzing several regions together, there were insufficient examples from which to extrapolate a reasonable rule.

Although the end algorithm was only moderately successful, given more data it should not be difficult to get a good classifier. The most notable limitation of our implementation lay in the fact that we mostly considered regions individually. On the regional level, it is very difficult to tell the difference between a painting in one style and a painting in another - many of the components are very much the same. This painting may have this nice little green region - it doesn't mean that the painting is impressionist. To effectively solve this problem, a network combining most of the region results would be necessary. To make a good classifier, it would probably require the combination of 50-100 regions.
Although this would require a huge amount of data to train properly, the initial findings we have shown here suggest that it could be quite successful. Such an extension could also benefit from further features, but this might not be feasible, since the more complicated features often take an impractical amount of time to derive.

Related work
Image segmentation:

  1. "Blobworld: Image segmentation using Expectation-Maximization and its application to image querying"
        C. Carson, S. Belongie, H. Greenspan and J. Malik
  2. "Image Segmentation Using Local Variation"
        P. Felzenszwalb, D. Huttenlocher
  3. "A fast algorithm for MDL-based multi-band image segmentation"
        T. Kanungo, B. Dom, W. Niblack and D. Steele
  4. "Snakes: Active Contour Models"
        M. Kass, A. Witkin and D. Terzopoulos
  5. "Note on Active Contour Models and Balloons"
        L. Cohen

Image classification:

  1. "Indoor-Outdoor Image Classification"
        Martin Szummer and Rosalind W. Picard

Image archive:

  1. "The Artchive"
        Mark Harden
