AI Brain Animated

Artificial Intelligence of Art

Let’s Make some Art

Learning through experience is one of the best ways to learn, so this is how I went about educating myself on machine learning, artificial intelligence, and neural networks. I am an artist and programmer who became interested in how computers could make art. So let's use artificial intelligence to create new artwork.

I decided to use an open source project on GitHub called art-DCGAN by Robbie Barrat. His implementation was recently used in Christie's first artificial intelligence art auction, but I first encountered Robbie Barrat's work on superRare.co.

To start out, we need some data to train the neural network. I wanted to use simple drawings to train my first model, and I happen to have a library of art that I personally created: line drawings made using sound played through an oscilloscope. It turns out that most of the prebuilt code libraries use very small image sizes, and you need at least 10,000 images to train them effectively.

Get Lots of Images

I happen to have movies of animated oscilloscope line drawings. Most of them come from this artwork called “Baby’s First Mobile”.

To generate a large number of images, I turned the videos into individual frames with ffmpeg, using commands like:

ffmpeg -i babyMobileVoice.mp4 -vf crop=500:500 thumb%04d.jpg -hide_banner

I was able to crop out the center of the video and produce individual frames. After running ffmpeg on all of my oscilloscope videos and deleting all the black frames, I ended up with over 10,000 images.
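
The black-frame cleanup can be scripted. Here is a minimal sketch using OpenCV that removes any frame whose mean brightness falls below a threshold; the directory argument and the threshold of 10 are assumptions to tune against your own footage:

import os
import sys

import cv2

frame_dir = sys.argv[1]
threshold = 10  # mean brightness below this counts as a black frame

for name in sorted(os.listdir(frame_dir)):
    if not name.endswith(".jpg"):
        continue
    path = os.path.join(frame_dir, name)
    frame = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if frame is not None and frame.mean() < threshold:
        os.remove(path)
        print("deleted", name)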

Make them Small

The next step is to make the images small. Some tools only accept 64 x 64 pixel images; art-DCGAN uses 128 x 128 pixel images. I borrowed a Python script from this blog at coria.com. I also read that someone doubled their number of training examples by rotating each image 90 degrees, so I added a rotate feature that saves a rotated copy of each image.

import sys
import os
from os import walk

import cv2
import imutils

if len(sys.argv) != 5:
    print("Please specify width, height, input directory and output directory.")
    sys.exit(0)

# width to resize
width = int(sys.argv[1])
# height to resize
height = int(sys.argv[2])
# location of the input dataset
input_dir = sys.argv[3]
# location of the output dataset
out_dir = sys.argv[4]

print("Working...")

# gather all the pictures in the input directory
images = []
ext = (".jpeg", ".jpg", ".png")

for (dirpath, dirnames, filenames) in walk(input_dir):
    for filename in filenames:
        if filename.endswith(ext):
            images.append(os.path.join(dirpath, filename))

for image in images:
    img = cv2.imread(image, cv2.IMREAD_UNCHANGED)
    h, w = img.shape[:2]

    if h > height or w > width:
        # shrinking: INTER_AREA gives the best results
        interp = cv2.INTER_AREA
    else:
        # enlarging: INTER_CUBIC gives the best results
        interp = cv2.INTER_CUBIC

    # scale to fit inside width x height while preserving the aspect ratio
    scale = min(width / w, height / h)
    w = round(w * scale)
    h = round(h * scale)
    # pad the bottom and right edges with black to reach exactly width x height
    pad_bottom = height - h
    pad_right = width - w
    print(image, "->", w, "x", h)

    scaled_img = cv2.resize(img, (w, h), interpolation=interp)
    padded_img = cv2.copyMakeBorder(
        scaled_img, 0, pad_bottom, 0, pad_right,
        borderType=cv2.BORDER_CONSTANT, value=[0, 0, 0])

    # write the resized image plus a copy rotated 90 degrees
    # to double the number of training examples
    cv2.imwrite(os.path.join(out_dir, os.path.basename(image)), padded_img)
    rotate_img = imutils.rotate(padded_img, 90)
    cv2.imwrite(os.path.join(out_dir, '90_' + os.path.basename(image)), rotate_img)

print("Completed!")

Running this command

python resize_images.py 128 128 /origin /destination

will produce images of the proper size for the training to begin.

Set Up an AWS GPU Instance

The GitHub art-DCGAN project has install directions, but I will add to them here. In AWS I selected:

Ubuntu Server 14.04 LTS (HVM), SSD Volume Type – ami-06b5810be11add0e2 with the g3s.xlarge instance type.

Install all the Software

Once the instance started, I followed the directions on art-DCGAN install. Make sure to find the exact packages named in the directions. On the Nvidia site you will need to press the Legacy Releases button to find the right version of the CUDA software. The deb file I used is cuda-repo-ubuntu1404-7-5-local_7.5-18_amd64.deb. Newer releases of the software will not work with the art-DCGAN code.

Once it has finished building, clone the art-DCGAN code and move your thousands of images into a folder called myimages/images. Make sure there is a folder called images inside the myimages folder, as described in the project that art-DCGAN was forked from.

Now you can run the command below, but note the checkpoint options. These cause it to save checkpoint models as it runs, which matters because the code tended to crash on me before it completed.

DATA_ROOT=myimages dataset=folder saveIter=10 name=scope1 ndf=50 ngf=150 th main.lua

saveIter tells it to save a model for both the discriminator and the generator every 10 epochs. In my case an epoch was 318 training rounds, which, at the tool's default batch size of 64 images, roughly matches the 20,000-plus images in my training set after adding the rotated copies.

But with these settings the training would constantly collapse. If the output of the training shows either the discriminator or the generator running at 0%, your training has failed. To fix it, I lowered the number of filters for the discriminator so it could keep up with the generator.

A GAN is trained adversarially: the generator creates a random image, and the discriminator judges it against the source training images. The process repeats until the neural network is wired to produce images like the ones in the training set (a code sketch of this loop follows the command below).

DATA_ROOT=myimages dataset=folder saveIter=10 name=scope1 ndf=12 ngf=150 th main.lua

is what worked for my particular artwork.
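
For intuition, here is a minimal sketch of that adversarial loop. It is written in PyTorch for readability, not the Torch/Lua that art-DCGAN actually uses, and the tiny fully connected networks are illustrative stand-ins for the real DCGAN convolutional architecture:

import torch
import torch.nn as nn

latent_dim = 100
image_dim = 128 * 128

# generator: random noise vector in, flattened image out
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                  nn.Linear(256, image_dim), nn.Tanh())

# discriminator: flattened image in, probability of "real" out
D = nn.Sequential(nn.Linear(image_dim, 256), nn.LeakyReLU(0.2),
                  nn.Linear(256, 1), nn.Sigmoid())

loss = nn.BCELoss()
opt_G = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_D = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(real_images):
    # real_images: a (batch, image_dim) tensor scaled to [-1, 1]
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # train the discriminator to separate training images from generated ones
    opt_D.zero_grad()
    fakes = G(torch.randn(batch, latent_dim))
    d_loss = (loss(D(real_images), real_labels) +
              loss(D(fakes.detach()), fake_labels))
    d_loss.backward()
    opt_D.step()

    # train the generator to fool the updated discriminator
    opt_G.zero_grad()
    g_loss = loss(D(fakes), real_labels)
    g_loss.backward()
    opt_G.step()
    return d_loss.item(), g_loss.item()

When either network wins outright, as in the 0% collapse described above, its opponent stops getting a useful gradient to learn from; shrinking the discriminator with a lower ndf rebalances the contest.
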
After running for five hours and making it to 90 epochs before failing, I ended up with some images that look a lot like mine, but with many variations.

Now the Artificial Intelligence model will make some new art

To generate a 36-image grid, find the newest model file and use the command:
batchSize=36 net=./checkpoints/scope_90_net_G.t7 th generate.lua

AI Generated Oscilloscope Art

Now it becomes a matter of picking out the best work as an AI art curator.

With all of this work, it becomes obvious that this is a very slow way to make art. It is still very much a human endeavor. But allowing a computer to generate lots of ideas for an artist to explore is helpful. Many artists have used randomized systems to generate art ideas. Artists such as David Bowie, Brian Eno, Nam June Paik, and John Cage have used mechanical and computer generators to automate some of their work and kick-start the creative process.

In addition, by building this model we have learned how to train other models that can be used for image recognition. You can also try the other features of art-DCGAN, which will download art from a public site and then generate new images based on that genre.

Technology Inspiring Art

Technologists routinely get pegged in the geek category, but our roles also require us to come up with creative solutions to technical challenges. This creativity can help us extend into the realm of art. Recently, I used technology as a medium for artistic expression while creating a sculpture entitled The Technologist.

Inspiration

Inspiration is so often fused from many memories, emotions, ideas, and events. My inspiration for The Technologist occurred while on vacation at a resort called Twin Farms in Vermont.

I was dancing with my young daughter to salsa music on the jukebox, in a recreation room full of witty art. The room contained a strange, playful set of old televisions playing old-school MTV graphics. It happened to be Internet Dweller, one of the twelve installations in Nam June Paik's exhibition Electronic Super Highway: Nam June Paik in the '90s.

Internet Dweller by Nam June Paik

Background

Nam June Paik's exhibitions were meant to be participatory, as many of his pieces allow the audience to manipulate the sound or video to make their own art. As a one-time member of the Fluxus movement and a performance artist, Paik passionately encouraged everyone to participate in events that create art.

One of Paik's pieces, called Random Access, allowed the viewer to create sounds with a wand that read magnetic tape attached to a wall. Another, entitled TV Crown, enabled the viewer to change the artistic patterns of lines on a TV screen. Other installations put the viewer's face right into the piece with a closed-loop camera, like the Electronic Superhighway exhibit.

Creating The Technologist

Based on my encounter with Internet Dweller in Vermont and exposure to other work by Nam June Paik, I created The Technologist. My sculpture comprises a simple male wig head with CPU and memory chips – some carefully placed and others smashed into the surface. It also includes parts from a Flip camera and wireless routers. The Technologist's eye plays a five-minute video on an embedded iPod nano (4th generation) with sound from attached speakers or headphones.

Two angles of the Technologist

The video playing in The Technologist‘s eye is the first thing that pulls you into the sculpture. It includes footage from Paik’s piece with sights and sounds from the day I encountered it in Vermont. I was so busy recording the artwork that my daughter was begging me to dance more, “Daddy…dance after this picture. Dance!” So my daughter’s voice is also preserved for posterity in the piece.

I have been working on The Technologist for over a year as I refined the video and the installation into the head. At first, I attempted to use a Microsoft Zune since it has WiFi capability. The original vision was to produce a series of pieces that could be connected into a network in a future exhibition titled Temple of Technology. Unfortunately, the small Zune does not have a way to loop videos, and I did not want to invest the time in programming a video player for it.

Most iPod nanos have video playback ability and allow looping in a playlist. The iPod nano also places the power cord and headphone jack in a convenient position that allows the wires to run through the middle of the head. So it seemed like a much better solution.

Views of The Technologist — top row: Wild Side, Third Eye, Antenna & AMD CPU; bottom row: Processing Steps, Video Eye, Heads Up Controls

Layers of Meaning

The video in The Technologist's eye also contains QR codes, generated with QR Code Generator, that invite the viewer to explore further layers. This leads us to ask: what is the boundary of this art now that it has jumped into your smartphone?
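
The codes came from an online generator, but they can also be produced programmatically. Here is a minimal sketch using the Python qrcode package; the URL is just a placeholder, not the link embedded in the piece:

import qrcode

# the URL below is a placeholder, not the one hidden in the artwork
img = qrcode.make("https://example.com/technologist")
img.save("technologist_qr.png")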

This piece is partly an expression of the relationship between human individuality, spirituality, and natural rhythm, and their conflict with the drive of technology. It contains multilayered ideas on this theme, including building loving relationships in the midst of ever-increasing demands for efficiency. There is an emotional paradox expressed in the piece on the role of technique versus creativity and love.

The piece is filled with layers of meaning and emotion. So I leave it to you to discover what you see and feel in The Technologist. Please let me know what you find.

 

To learn more about Volume Labs and Volume Integration, please follow us on Twitter @volumeint and check out our website.